February 15, 2018

How to Leverage Paxata for Timely Anti-Money Laundering Investigations

By Mike White

One of the biggest hurdles faced by many enterprises today is simply getting data ready for analytics and decision-making. Bringing together multi-structured datasets from diverse sources, profiling them, cleansing them, and shaping them are just a few of the challenging tasks that plague modern enterprises across all industries.

To illustrate this point, let’s take a look at the... Read More

January 12, 2018

Gartner: Key Trends of Modern Data Quality Tools

By Farnaz Erfan

The accelerating rate of digital business, growing complexity of incoming data, and limitations with existing data quality tools are producing a complex data landscape and generating heightened data quality requirements.

According to Gartner, “Data quality tools are experiencing fundamental changes in eight key areas: audience, governance, data diversity, latency, analytics, intelligence, deployment and pricing.”*

As a preview, here are... Read More

November 15, 2017

Top 3 Criteria for Selecting a Data Preparation Tool

By Farnaz Erfan

Organizations looking to add modern data preparation to their analytics technology arsenal have multiple choices – ranging from line of business, self-service solutions to modules from legacy, IT-centric data management platforms.

Diverse use cases, varied skill levels, and unique business requirements make the data preparation tool selection process complex and confusing. Knowing the correct evaluation and selection criteria... Read More

November 2, 2017

How Disruptive Technology is Changing the Game in Financial Crimes Compliance

By Farnaz Erfan

In my previous blog post, we discussed the risks and limitations of current information management practices with respect to compliance reporting, as revealed by the findings from a recent benchmark study conducted by FIMA in partnership with Paxata.

Today I’d like to talk about disruptive technologies that enable financial institutions (FIs) to overcome these limitations.

A full two-thirds (67%) of... Read More

October 20, 2017

The Risks and Limitations of Current Information Management Practices in Financial Crimes Compliance Initiatives

By Farnaz Erfan

In my earlier blog post, I discussed some of the findings from a recent FIMA benchmark study regarding the current state of information management for financial institutions (FIs). The statistics showed that FIs continue to be burdened by compliance-related issues and remain highly dependent on IT departments to address them.

Today I’d like to talk about the shortcomings... Read More

October 9, 2017

3 Enterprise Requirements Where Data Prep with Excel is Less Than Stellar

By Farnaz Erfan

Excel has long been the tool for business analysts to perform lightweight data preparation tasks – identifying outliers and errors, aggregating values, and combining data into one spreadsheet for analytics.  However, all too often, business users waste time using Excel to manually profile and process data.

Truth is that Excel is inadequate for enterprise projects that comprise large-scale... Read More

October 9, 2017

Top Financial Services Priorities

The People, Practices and Processes Involved

By Farnaz Erfan

Paxata and FIMA (Financial Information Management Association) collaborated on a benchmark study examining the current state of information management and governance for financial institutions (FIs).

The research outlines an astounding view of compliance pressures and illustrates how the uncertainty of meeting future regulatory burdens is inducing FIs to invigorate their information practices.

While driving... Read More

October 4, 2017

3 Major Trends at Strata New York 2017

By Farnaz Erfan

Enterprise data architects, data engineers, and business leaders from around the globe gathered in New York last week for the 3-day Strata Data Conference, which featured new technologies, innovations, and many collaborative ideas.

From speaking with attendees at the Paxata exhibit booth, partaking in networking events, and attending sessions, we observed three major trends:

1) Convergence of technologies

Many announcements... Read More

September 28, 2017

Paxata launches Self-Service Data Preparation on Azure HDInsight to accelerate Data Prep

By Pranav Rastogi, Program Manager, Azure, Big Data

We are pleased to announce the expansion of HDInsight Application Platform to include Paxata, a leading self-service data preparation offering. You can get this offering now at Azure Marketplace and read more on the press announcement by Paxata.

Azure HDInsight is the industry leading fully-managed cloud Apache Hadoop and Spark offering, which gives you... Read More

September 26, 2017

Putting the Promise of the Data Lake in Reach

By Rik Tamm-Daniels, VP Technology and Partnerships, Paxata

Agility, elastic scaling, and unlimited analytic potential. The promise of a data lake is very appealing to organizations that were thwarted and obstructed by legacy infrastructure in their quest to drive business via data. Regrettably, the reality of the actual journey needed to reach the promised land proved to be quite daunting... Read More

August 30, 2017

Converting from a UNIX Timestamp

A Unix timestamp can be converted in Paxata using a Compute statement.



The steps for converting from Unix timestamp to Gregorian date are as follows:

Click on the Compute icon in the left-hand Tools bar.
Give your new column a name, such as “Transaction Date”.
Enter this formula: Dateadd(Datevalue("1970-01-01", "yyyy-MM-dd"), @Unix Time Stamp@, "seconds") (replace @Unix Time Stamp@ with your actual column name).
Click... Read More
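For readers who want to sanity-check a converted value outside of Paxata, the same arithmetic can be sketched in Python: start at 1970-01-01 and add the timestamp as seconds. This is only an illustration of the idea behind the formula, not Paxata code; the function name and the sample timestamp are assumptions, and the result is expressed in UTC.

```python
from datetime import datetime, timedelta, timezone

# Same idea as the Dateadd formula: start at the Unix epoch and add the seconds.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_to_datetime(seconds):
    """Convert a Unix timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return EPOCH + timedelta(seconds=seconds)


# Hypothetical sample value standing in for a "Unix Time Stamp" column entry.
transaction_date = unix_to_datetime(1504051200)
print(transaction_date.isoformat())
```

If the source column stores milliseconds rather than seconds, divide by 1000 first.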
August 17, 2017

Creating a Sequence Number based on Multiple Columns

The Auto Number tool is a handy utility.  This tip of the day goes through the very simple process of using Auto Number to create a sort index based on the value in multiple columns.


Assume your data looks like this.  There is a Group column with values A, B and C.  There is also a GroupSeqNo column which contains... Read More
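The excerpt is cut off before the steps, so the following is only a rough Python sketch of the general idea (a sort index driven by more than one column), not a description of how the Auto Number tool works. The function name, the output column name, and the sample rows are assumptions.

```python
def add_sequence_number(rows, sort_columns, new_column="Sort Index", start=1):
    """Order rows by several columns, then number them consecutively."""
    # Build a tuple key so ties on the first column are broken by the next one.
    ordered = sorted(rows, key=lambda row: tuple(row[c] for c in sort_columns))
    return [{**row, new_column: i} for i, row in enumerate(ordered, start=start)]


# Hypothetical data with the Group and GroupSeqNo columns mentioned above.
data = [
    {"Group": "B", "GroupSeqNo": 2},
    {"Group": "A", "GroupSeqNo": 2},
    {"Group": "B", "GroupSeqNo": 1},
    {"Group": "A", "GroupSeqNo": 1},
]
numbered = add_sequence_number(data, ["Group", "GroupSeqNo"])
```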

August 7, 2017

Deriving Columns & Values from Key-Value Pairs

Key-value pairs are pretty common in the world of data prep. A key-value pair is a set of two linked data items where one represents an identifier and one represents the value. An easy example is Age:21. The “key” is age and the value is 21. Converting a key-value pair to columns and data can be handled easily... Read More
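As a rough illustration of the idea (not the Paxata steps, which the excerpt does not show), here is a small Python sketch that turns strings of key-value pairs into columns and values. The semicolon between pairs, the function name, and the sample rows are assumptions; only the Age:21 pattern comes from the text.

```python
def parse_key_value_pairs(text, pair_sep=";", kv_sep=":"):
    """Split a string such as "Age:21;City:Austin" into a dict of column -> value."""
    record = {}
    for pair in text.split(pair_sep):
        if not pair.strip():
            continue  # ignore empty fragments such as a trailing separator
        key, _, value = pair.partition(kv_sep)
        record[key.strip()] = value.strip()
    return record


# Hypothetical input rows; each key becomes a column, each value a cell.
rows = ["Age:21;City:Austin", "Age:34;City:Chicago"]
records = [parse_key_value_pairs(r) for r in rows]
columns = sorted({key for rec in records for key in rec})
table = [[rec.get(col, "") for col in columns] for rec in records]
```

Rows that lack a given key simply get an empty cell in that column.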

July 28, 2017

The Unbelievable Truth About Financial Crimes Compliance Reporting

By Farnaz Erfan

While financial crimes compliance (FCC) has long been one of the central motivators for developing stronger data management practices, the intensity of regulations has increased over time, thereby elevating expectations for quality and speed in compliance reporting.

Meanwhile, perpetrators are not standing still. Rather, they are moving forward with new technologies to pursue their calling. In fact, fraud... Read More

July 18, 2017

Getting a count of distinct values

There is often a need to compute the distinct number of values in a column.  To explore the data, a Filtergram is a quick and easy way to find out the distinct number of values in a column.

When there is a need to have that value in the dataset, there is a simple approach to do so.  The approach... Read More
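To make the goal concrete, here is a minimal Python sketch of what "having that value in the dataset" could look like: count the distinct values in a column and write the count onto every row. It illustrates the outcome only, not the Paxata approach the post goes on to describe; the function and column names are assumptions.

```python
def add_distinct_count(rows, column, new_column="Distinct Count"):
    """Attach the number of distinct values in `column` to every row."""
    distinct = len({row[column] for row in rows})
    return [{**row, new_column: distinct} for row in rows]


# Hypothetical sample data.
orders = [{"State": "CA"}, {"State": "NY"}, {"State": "CA"}]
with_count = add_distinct_count(orders, "State")
```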

July 17, 2017

7 Financial Services Initiatives Where Self-Service Data Prep is Vital

By: Farnaz Erfan

The ability to effectively and efficiently manage the vast and disparate range of corporate, customer and business information is rapidly emerging as a critical objective for financial services companies —regardless of their respective market and geographic focuses.

While data management has always been a critical component of financial services companies’ initiatives to better understand their customers, manage risks,... Read More

July 17, 2017

Using DATEVALUE() when there is a T separating the date and the time

Today’s Tip of the Day focuses on a very particular case of using the DateValue() function.  Specifically, we will discuss excluding “filler” values in a string while converting the non-filler values to a date and time.

A common example of a date string with a filler value might be:

An uncommon example might be a string which looks like this:

dep.... Read More
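The example strings are missing from this excerpt, so the sketch below assumes an ISO-style value such as 2017-07-17T09:30:00. It shows the general technique in Python: the "T" is treated as a literal filler character in the format string while the remaining parts are parsed as date and time. How DateValue() expresses the same thing is not shown here, so this is an analogy rather than Paxata syntax.

```python
from datetime import datetime


def parse_t_separated(text):
    """Parse a date string whose date and time are joined by a literal 'T'."""
    # The 'T' in the format is matched literally; it carries no date information.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


# The sample value is an assumption, not taken from the original post.
parsed = parse_t_separated("2017-07-17T09:30:00")
print(parsed)
```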

July 12, 2017

Delete Filter & Mute Filter buttons in the Steps List

Sometimes, a step will have unwanted filters.  There are 3 handy buttons to help clean out unwanted filters.

If you are not sure what filter(s) are applied, click on the Filters link (A).  This will show you what filters are applied.

If you are not sure what impact the filter has on your data, click on the MUTE FILTER button (B).

If... Read More

Tags: TipoftheDay
June 28, 2017

Computing % of Total

Running totals and windowed aggregates are best handled in a Business Intelligence (BI) tool which allows for sophisticated aggregation capabilities.  However, there are times when you need an aggregate of a column at the data prep layer because other formulas rely on this aggregate value or the data isn’t going to be used in a BI tool. This tip... Read More
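As a quick illustration of the calculation itself (not of the Paxata steps, which are cut off here), a percent-of-total column needs two things: the aggregate of the whole column, and each row's value divided by it. The Python sketch below assumes numeric values; the function and column names are made up for the example.

```python
def add_percent_of_total(rows, column, new_column="Pct of Total"):
    """Add each row's share of the column total, expressed as a percentage."""
    total = sum(row[column] for row in rows)
    return [
        # Guard against an all-zero column to avoid dividing by zero.
        {**row, new_column: (row[column] * 100 / total) if total else 0.0}
        for row in rows
    ]


# Hypothetical sample data.
sales = [{"Region": "East", "Sales": 50}, {"Region": "West", "Sales": 30}, {"Region": "North", "Sales": 20}]
with_pct = add_percent_of_total(sales, "Sales")
```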

June 19, 2017

The First Two Steps of a New Project

By Dan Howell

When starting a new project, it is helpful to follow two tips to make it easier to:

See only what you need to see, eliminating a lot of horizontal scrolling
Isolate particular rows by row number to spot check rows in a large dataset
Return to the original sort order

When starting a new project – especially when... Read More

June 16, 2017

Interactive Column Search

By Mike White


Working with wide data can be a challenge when you’re trying to find that one column in the midst of hundreds or even thousands. It’s even more frustrating when the headers consist of cryptic codes resulting from legacy systems, character-limited headers, or an abbreviation in a foreign language. Fortunately, Paxata provides a clever way... Read More

June 14, 2017

Using the Fill Down feature

By Dan Howell

Use fill down to populate blank cells with data from the cell above. The fill down continues for blank cells until the next populated cell is encountered in the column. This feature is commonly used with data that has header rows in the data (as shown in the example below),... Read More
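The behavior described above can be sketched in a few lines of Python: walk down the column, remember the last populated cell, and copy it into each blank until the next populated cell appears. This is an illustration of the rule, not Paxata's implementation; treating None and the empty string as "blank" is an assumption.

```python
def fill_down(values):
    """Replace blank cells with the nearest populated cell above them."""
    filled = []
    last = None
    have_last = False
    for value in values:
        if value is None or value == "":
            # Blank cell: reuse the value above, if one has been seen yet.
            filled.append(last if have_last else value)
        else:
            last, have_last = value, True
            filled.append(value)
    return filled


# Hypothetical column where a header value appears once per block of rows.
region = ["East", "", "", "West", ""]
print(fill_down(region))
```

Blank cells at the very top of the column have nothing above them, so they are left as they are.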

June 8, 2017

Apply Project Steps to a Different Dataset

By Mike White


Although Paxata makes it very easy to build out a set of transformations, computations, lookups, and aggregations from scratch, you may want to leverage an existing data prep project on another dataset. For you this could be a new set of transactions, events from a different period, a different customer or product file, or... Read More

April 20, 2017

How Paxata’s Values-Based Beliefs Lead to a “Best-Places To Work” Company

By Prakash Nanduri

As our world experiences massive transformation, it’s easy to get caught up in the buzz about AI, autonomous vehicles, robots and drones. But amid all the noise, it’s worth remembering that one of the core aspects of company success is its people and thriving values-driven culture.

We don’t hesitate to say:  It’s our values that have led Paxata... Read More

April 19, 2017

Data-Driven or Information Inspired

By Prakash Nanduri, Co-Founder and CEO, Paxata

From its inception, Paxata has always been focused on transforming the lives of you the business consumer:  the business analyst, the marketing analyst, the financial planning and analysis guru, and you the intelligence analyst. The hardest part of your jobs is turning raw data into information that is ready for analytics and insight.

You... Read More

April 19, 2017

Seeing is Believing: MicroStrategy fans get ready for warp speed

By Rik Tamm-Daniels

I recently celebrated 2 years with Paxata, and in that time, I’ve spoken with many partners and customers about core data challenges that have made enterprise data prep a strategic and transformative initiative for so many companies. And I’ve shown them how fast and easy Paxata makes everything from data profiling to data integration... Read More

March 28, 2017

How Easy is Extremely Easy – Paxata + Amazon Connect

Paxata Makes Customer 360 and Customer Insight Analytics with Amazon Connect Easy

Contributing author: Rik Tamm-Daniels, VP of Technology and Partnerships

We at Paxata are proud to be a long-standing member of the Amazon AWS partner family and excited to be part of the Amazon Connect launch!  In true Amazon fashion, Amazon is disrupting another long-standing industry with both a mature... Read More

March 15, 2017

Bringing the Power of the Data Lake to the Business

By Jen Benito, Data Intelligence Principal Consultant, Trace3

I come from the world of traditional RDBMS Data Warehouses and Business Intelligence, where I spent the first 20 years of my career.  Today, as a member of Trace3’s Data Intelligence team, I hear the phrases ‘business insights’ and ‘business outcomes’ as key objectives from my customers all the time, but don’t... Read More

March 14, 2017

Opinion: Tips for making your data lake thrive

Originally published on Information Management
Manan Goel is a senior director at Paxata

Big data offers tremendous opportunities to outsmart your competition and obtain insights on your business. By transforming big data into actionable information, you can open your organization up to new opportunities by identifying additional markets and customer segments, and by capitalizing on product innovation.

One of the leading... Read More

December 9, 2016

Reinventing the Information Pipeline

Contributing author: Rik Tamm-Daniels, VP of Technology and Partnerships

An attendee from Pinterest sat next to me at MapR’s Big Data Everywhere event in Redwood City, CA. He sketched out his problem on a notepad: his financial team is trying to find ways to cut costs, comparing billing statements with actual usage of something they rely on. The problem? The finance team is evaluating bills from... Read More

November 7, 2016

Power to the people – the end of data discrimination

The following post is written by Paul Urban from Eccella Consulting, a Paxata Transform Partner. Come see Paul and more great people from the Eccella Consulting team at Booth #401 at the Tableau Customer Conference 2016. 

Over 13,000 people will be pouring into downtown Austin early November for TC16, schlepping their laptop bags in between different buildings and conference rooms with... Read More

November 3, 2016

Nobel Prize goes to…Tableau

A 5 year Tableau journey

My Tableau journey started in 2011 with the Tableau Customer Conference in Las Vegas, their first customer conference outside of Seattle. It was also the first time the customer conference crossed the 1,000 attendee mark. There were fewer than 10 sponsors; all of us were excited to be part of a movement. Fast forward five... Read More

October 5, 2016

Why Spark isn’t enough (by itself)

This is the third in a series of blog posts inspired by a recent presentation given at DataVersity 16 in Chicago by Shachar Harussi. Shachar and I discussed the lessons the Paxata development team learned while building the distributed Apache Spark architecture for the Paxata platform. Here is a peek into what we talked about – refer to this... Read More

September 26, 2016

Journey to Apache Spark

In a previous blog post, I mentioned that Shachar Harussi and I discussed the lessons we learned building the Apache Spark-based architecture for the Paxata platform at DataVersity in Chicago. I can’t wait to talk about this at Strata NYC this week. Come by booth #301 and find me to ask me more about these topics!

Here is a... Read More

September 21, 2016

Building Paxata with Apache Spark

Shachar Harussi (a senior distributed systems engineer at Paxata) and I are happy and grateful that we got to share the lessons we learned while building the Apache Spark-based architecture for the Paxata platform today at Dataversity in Chicago. Thank you so much to the wonderful folks at Enterprise Dataversity!

I am excited to talk about this more next week at Strata NYC. Come... Read More

June 27, 2016

Successfully leveraging a data lake across multiple Hadoop distros

Comparing Hadoop distribution vendors is a popular topic among Big Data writers. In many organizations, however, the comparison is happening inside of their own walls, with test clusters running multiple distributions side-by-side, serving multiple internal needs.

Every organization has multiple databases, and with the growing popularity of Hadoop and related technologies, more than one Hadoop distribution as well. Analysts access data stored... Read More

May 31, 2016

Meetup: Paxata Prep and Tableau Viz

Meetups @ Paxata: Data Prepsters Bay Area with Tableau and TAM Group

Tableau partners, experts, and novices gathered at the Paxata offices to eat sushi, drink beer, and talk about data visualization and data preparation. We met sales analysts, economists, data scientists, marketing analysts, and developers over a lively conversation in our Redwood City HQ.

Interested in Tableau and Paxata but... Read More

May 23, 2016

Survey Says: Retail Customer 360

Part 2 of 2: Retail business teams are constantly struggling to connect data – internal data to external data. In a previous post (part 1), we discussed and solved a huge data integration challenge for a team of retail data curators and business analysts.

Now the analyst team has a focused, detailed, contextual, and accurate dataset of loyal customers, their purchase history,... Read More

May 16, 2016

Top 5 best takeaways from TDWI

At TDWI in Chicago this week, I had the honor to be part of Mark Madsen’s Data Integration Innovation class. He is an incredible speaker and creates presentations with great slides. Each slide is interesting, poignant, and funny. I thought I would share my top five favorite slides and some of my thoughts:

Number 5: I completely fell for the... Read More

May 10, 2016

Shopping for high quality data with retail data stewards

Understanding the customer journey in today’s omnichannel retailing world is anything but simple: people surf websites on their computers, then casually shop on their phone, check social media, walk into brick and mortar stores to shop in person. Omnichannel rinse and repeat! How can a retailer track a customer’s journey across all of these experiences? How can the business... Read More

May 3, 2016

Paxata and Data Robot in Portland

Paxata and DataRobot co-hosted an open conversation about data in true Portland style. What better place to talk about analytics, data science, and data preparation in the enterprise than a whiskey library!

Data architects, BI experts, and directors of analytics from logistics, manufacturing, and retail companies talked about their common desire to transition into data companies. They share a common... Read More

April 24, 2016

Connecting with the Spark Community

On Tuesday night, the Paxata Lab was packed with people who came from all over the Bay Area to participate in the Spark Workshop on the Peninsula Meetup of the SF Big Analytics group, the first in a four-part series. As long-time advocates of Spark (we built the Paxata platform on Spark and released to our customers with release 1.0.0 back in 2014),... Read More

April 13, 2016

Philosophies and algorithms to untangle retail data

We work with a global retailer who has thousands of different datasets they need to analyze – from simple spreadsheets to impossible-to-parse data in complex JSON and XML files, and everything in between. Where does it come from?

Campaigns (marketing campaign, coupons / sales)
Inventory (assets, product codes, merchandise lists, prices)
Online activity (click traffic, social media mentions)
Customer service (service tickets, call logs)
Transactions (online... Read More
April 1, 2016

Who has the Glengarry Leads??

I am definitely dating myself with the reference to “Glengarry leads” but any good marketing person is sadly very familiar with that movie and has probably even heard these words from their field team: “where are the leads from the show?”

The Strata + Hadoop World conference just ended yesterday and as I watched our booth get dismantled, this blog topic popped in... Read More

March 23, 2016

Is it that time again?

A year has flown by, and Strata San Jose is upon us. As usual, Paxata is looking forward to connecting with some of the brightest people who are coming together to share and learn about Hadoop, Spark, Impala and other big data innovations.

Paxata has a few fun things planned so, if you happen to be in the Bay Area,... Read More

March 14, 2016

Awakening from Data’s Dark Ages

Welcome to the Era of the Information Renaissance

Netscape Navigator was launched in late 1994 and in less than 8,000 days since it came on the scene we’ve witnessed an unbelievable explosion of access to and consumption of information at speeds unheard of in the entire evolution of humankind. I was reflecting on this just recently after my seven year old,... Read More

March 9, 2016

Winning with accurate and actionable information in Banking

The Situation

In today’s information-driven economy, nowhere is clean, connected and trustworthy information more vital than in banking, where information is the lifeblood of the business. In banking, accurate, timely and actionable information is the difference between market leaders and also-rans. The banking business model is based on the concept of leverage. Banks raise capital through deposits, borrowings and sale of... Read More

March 3, 2016

Our Winter ’15…our little black dress

In November 2015, we rolled out our Winter ’15 release. As our customers have gotten their hands on it, their feedback has inspired me to write my own perspective.

Personally, I think we may have named the release wrong. Instead of “Winter ’15,” we should have called it: Enterprise-Grade Platform, the Little Black Dress (also known as the “LBD”).  Why, you... Read More

December 10, 2015

Paxata helps CPG companies optimize supply chains and plug revenue leaks

Ever made a supermarket run only to find out that the brand of beer you were looking for is out of stock? If you are like most consumers, you’ll sulk, pick another brand and happily go home. After all, according to Deloitte’s 2015 American Pantry study, for the consumer packaged goods (CPG) category, brand loyalty continues a steady decline... Read More

October 20, 2015

Painted, Party People of #Data15

Here I stand at the Tableau Customer Conference (TC15, #data15)…watching all walks of data come together. I’ve met data scientists, pricing experts, business analysts, research analysts, Tableau administrators, and someone whose title is just “chief.” It’s fun to see three-piece suits, high heels and Converse sneakers next to a face painting booth. While many have used Tableau, some... Read More

October 19, 2015

No rotten tomatoes here!

Paxata’s Adaptive Data Preparation Platform has become so entrenched in our customers’ data analysis processes that we’ve been asked to supercharge it: more data preparation, more often, no scripting necessary. So we responded by creating Paxata’s new self-service automation experience with an interactive and visual interface baked right into the application. Self-service automation puts the power of end-to-end data... Read More

October 19, 2015

I ♥ ClicktoPrep

Where data prep used to take you a hundred steps backward from your analysis, with ClicktoPrep, the answer in Tableau is just one click away.

Anyone tasked with a data-driven business decision knows that analysis is never a one-way path and you never have all the data you need when you start your analytic process. It’s iterative, letting analysts move... Read More

October 17, 2015

Happy Spreadsheet Day?!

Who knew? October 17th is “spreadsheet day.” Not sure if it’s a coincidence that yesterday was “Bosses Day” but I am sure thousands of business analysts around the world wish they could take a day off from spreadsheet work.

This article talks about the joy of navigating, charting and graphing with spreadsheets … stuff Excel is great for. What it doesn’t tell you... Read More

October 12, 2015

Coming Clean on Sessions Data for TC15

The other day, I was looking at the TC15 website getting ready for the big event and came across the TC15 sessions data that Tableau has provided for creating visualizations and dashboards. This is a cool way of getting people ramped up for the show; so I figured I’d take a look at the data in Paxata to see... Read More

September 29, 2015

Cisco Data Preparation Solution, Powered by Paxata

The blogosphere lit up this morning with news about Cisco Data Prep, and the team at Paxata could not be prouder of the fact that we are the underlying platform fueling Cisco’s entry into the self-service data preparation market. Wondering why Cisco? Let’s start with a great blog written by Kevin Ott, formerly the Senior Vice President, Product Development... Read More

June 24, 2015

Adaptive…what’s in a word?

Around our offices, we use the word “adaptive” to mean two distinct and important things, both being at the heart of everything we do.

From a business perspective, your ability to adapt is directly tied to your ability to answer questions that you may not have even realized you needed to ask two hours ago. In that context, Adaptive Data... Read More

June 22, 2015

Connecting the dots of human trafficking

Could better data sharing help identify human trafficking victims?

A few years back, I had an opportunity to work at the Human Smuggling and Trafficking Center on a data analysis project that resulted in a report for the White House Domestic Policy Council called the National Assessment on Human Trafficking. As a part of this project, my team and I... Read More

June 15, 2015

From a tiny spark, comes a flame

This week, Paxata is demonstrating our customer success with Apache Spark at the Spark Summit in San Francisco. I was not around when the Industrial Revolution was taking place (and please hold all jokes about my age, thanks) but it sure seems like the Hadoop Revolution is underway. In fact, I will be bold and say that I have not... Read More

May 21, 2015

You Down With IoT?

If we thought we had big data issues before, things are about to get really crazy. While companies like Cisco and Intel are figuring out how to make “things” connected, smart and even more useful than they are today, those things are also going to produce a lot of data exhaust. Just because a toothbrush can collect data about a... Read More

May 21, 2015

Elastic cloud: no such thing as too much

In a recent blog post by one of our fine engineers, Rishi Tirumala, he referenced a phrase that marketing uses to describe the elastic cloud.

Since I am the culprit of that message, I thought I would explain a little more about how I see it all. Yes, I get that it’s a simplified statement to say “add or remove... Read More

May 21, 2015

The Queen Bee Of Marketing (And Data)

It’s time for another #PaxChat and the topic is near and dear to my heart: The data driven marketing organization!

Well, the prediction Prakash made was that, in 2015 the Chief Marketing Officer will hold the power (and biggest budget) for the big data strategy. I could not agree more, and here’s why: conservatively, a marketing team runs on 15-20 tools.... Read More

May 20, 2015

Data Quality? That’s not my department….

While the image I selected might imply that I think big data is part of the problem, the truth is data quality has been a challenge for business people like me forever.

I am not exaggerating when I say that we had duplicates, misspellings, blanks, inconsistent data even when we were given tiny but highly-structured data sets that would occasionally... Read More

May 20, 2015

Elastic is a great paradigm when “all you can eat” is the goal

People are not reducing their data consumption these days. If anything, they want more – more data, from more sources, with more diversity than we ever imagined. As we pioneer a world where data can be digested easily, we have to engineer every part of our solution to expand as customers demand it. It’s a great time to be... Read More

May 5, 2015


If you missed the last #PaxChat, it was a fun one. The topic was about Excel, and here’s my perspective: Excel is a fine tool for reporting, creating graphs and charts, especially if the single data set you are working on is already in Excel. However, like many analysts, I have tried to make Excel do unholy acts of... Read More

April 20, 2015

To Hadoop or Not to Hadoop

We have had a lot of fun picking apart our CEO’s 2015 predictions on our live #PaxChat tweet chat…and the next topic is going to be a fun one. Unless you have been trapped under an elephant’s butt, you know that everyone in the data management world has contemplated a Hadoop data strategy. Whether they are three years into it or... Read More

April 10, 2015

Diva love at first sight

You know how you meet someone and INSTANTLY feel like you are going to be friends forever? That is how I felt when I first met Kasey Dixon. We call her the DC Diva since she heads up the pre-sales group for our Federal team. She joined Paxata about a month ago and I knew we needed to get... Read More

March 30, 2015

Top ten reasons I love Paxata

Our head of Talent sent out a note yesterday asking everyone to share one reason why they love Paxata. I started to think about THE ONE thing and realized I couldn’t choose.

So…here are my top ten reasons why Paxata rocks!

The key to my heart is good grub. I literally roll out of the office door and find myself in... Read More
March 29, 2015

The old dog with new tricks

Yes, I am the old dog. I am on Google Chat and Hangouts, Snapchat, Facebook, LinkedIn, Twitter, Flipboard. So why did I need to participate in Tweet chats???? Do I really have more words to offer than places to put them?

Three weeks ago, the Divas did our first #PaxChat and it turned out to be a blast. We have... Read More

March 20, 2015

Are you making a real difference?

The software job market is HOT and software professionals have more opportunities than ever. But that doesn’t make things easier.

For people on either side of the hiring game, having too many choices actually makes the selection process harder. Good software engineers have many options, so how do they choose the opportunity that best fits their needs? For hiring managers,... Read More

March 19, 2015

Under Construction goes live!

Quite frequently, someone on Paxata’s engineering team will have a bee in their bonnet, a soapbox worth standing on or a contrary opinion worth talking about. As new members of the team come on board, they might even share an insider’s look at their first 90 days or the types of projects they are getting to take on.

“Under Construction”... Read More

February 24, 2015

“V” for Victory?

No – this time the “V” is for “Validation!” Sweet validation of the category we have worked really hard to create. After several years of what Jerry Maguire called “up-at-dawn, pride-swallowing siege,” the team at Paxata is thrilled to read the recent Forrester Research report entitled “Data Preparation Tools Accelerate Analytics.” Frankly, Forrester could have just published the title... Read More

February 24, 2015

Think ahead or get left behind

I love Gartner’s point about this in a recent research piece. Okay, they didn’t use those exact words but you get the point. So often, we focus on collecting and storing all this data without really thinking through how we are going to make it useful. I know everyone talks about “big data” but Gartner makes the point that... Read More

January 13, 2015

We Are Not “Yes” Divas…We are Data Divas

If you enjoyed what our Diva In Charge, Cari Jaquet, said about Prakash’s BI predictions that were published in Forbes, you won’t want to miss this new video series. Our newest Data Diva Julie Mayhew and I invited Joseph di Paolantonio, a big thinker on big data, to give OUR perspective on Prakash’s predictions.

Cari’s opinions are MILD compared to the picking... Read More

January 9, 2015

Unpredictable Data Diva Takes On CEO Predictions

A Diva With a Mind of Her Own…

No one has a crystal ball but you would think I had a pair of them for giving my honest opinion about Prakash Nanduri’s BI predictions published in Forbes.

In these quick segments, I decompose each of his six predictions and offer my take on whether I think Prakash is on the money... Read More

November 25, 2014

2014 predictions – How did we do?

As 2015 comes speeding around the corner, you can expect a slew of predictions to start surfacing…and the Paxata fortune tellers are busy on a few! I thought it would be fun to take a look-see at Paxata 2014 predictions Prakash published in Forbes and grade him on his ability to see into the future. I mean, how often... Read More

August 22, 2014

Paxata Data Reshaping FTW

I came across an excellent data reshaping use case with a prospect’s sample data set today and wanted to share my findings (and boost morale on a Monday).

Some business analysts are aware of macro code or Excel add-ins (Tableau provides a data reshaping add-in as a free download) which are available to unpivot (aka “flatten”) wide data sets to... Read More
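
For readers who have not seen an unpivot before, here is a minimal sketch of the idea in plain Python. It is not how Paxata or the Tableau add-in does it; the `unpivot` helper, the column names, and the sample figures are all made up for illustration.

```python
def unpivot(rows, id_columns, var_name="variable", value_name="value"):
    """Turn wide rows (one column per measure) into long rows (one row per measure)."""
    long_rows = []
    for row in rows:
        # Columns that identify the record are repeated on every output row.
        ids = {col: row[col] for col in id_columns}
        for col, val in row.items():
            if col in id_columns:
                continue
            # Every remaining column becomes its own (name, value) row.
            long_rows.append({**ids, var_name: col, value_name: val})
    return long_rows


# Hypothetical wide data set: one column per quarter.
wide = [
    {"region": "East", "Q1": 100, "Q2": 120},
    {"region": "West", "Q1": 90, "Q2": 95},
]

long_rows = unpivot(wide, id_columns=["region"], var_name="quarter", value_name="sales")
for r in long_rows:
    print(r)
```

Two wide rows with two quarter columns each come out as four long rows, one per region and quarter.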

July 10, 2014

Props to the King of Grok

“Grok” was today’s “word of the day,” and I immediately started humming “the King of Grok, that is my name…” (think: the Beastie Boys tune “Paul Revere”). It took me back to a year ago, when I was first introduced to Joseph di Paolantonio and his partner, Clarise Z. Doval Santos. We were meeting with a lot of industry analysts and... Read More

June 24, 2014

How to Choose a Self-Service Data Preparation Platform

No matter the lipstick shade…it’s still a pig.

That’s right. I wanted to write a piece about the next-generation data preparation solution – and why it was critical for Paxata to innovate the Adaptive Data Preparation platform from the ground up. Then, I realized the more interesting story is about the vendors who are doing just the opposite. I am... Read More

June 24, 2014

Data science or rocket science?

No question in my mind: people who use data to get smarter about their business may as well be astronauts because it’s not a trivial undertaking. More and more, people who thrive on data are getting the support of the organization around them. The Data Divas thought it would be fun to talk about what data-driven organizations (DDOs) look... Read More

June 20, 2014

What do Data and Dating have in common?

Of course, the knee-jerk answer to that question would be “it’s a numbers game.” But really, I have slightly better material than that…

The last time I was single, dinosaurs were still roaming the earth so, needless to say, entering today’s strange world of dating has been overwhelming – the variety, velocity and volume and ugh – veracity – of... Read More

June 16, 2014

Data Validation Without Conditional Formulas or VLOOKUPs

Here’s another little data validation nugget for you…

Sometimes, you are working with five or six or ten different spreadsheets and you need to harmonize a set of values across all of them. For example, let’s take states in North America. One spreadsheet might have states as two-letter abbreviations, while another has them spelled out. In Excel, you need... Read More
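
To make the state example concrete, here is a small sketch of the harmonization step in plain Python. It is only an illustration, not the Paxata approach: the `STATE_ABBREVIATIONS` table is a deliberately tiny, hypothetical lookup and `harmonize_state` is an invented helper name.

```python
# Hypothetical lookup table; a real one would cover every state and province.
STATE_ABBREVIATIONS = {
    "california": "CA",
    "new york": "NY",
    "texas": "TX",
}


def harmonize_state(value):
    """Return the two-letter abbreviation for a state given in either form."""
    cleaned = value.strip()
    # Already an abbreviation we know about: just normalize the case.
    if cleaned.upper() in STATE_ABBREVIATIONS.values():
        return cleaned.upper()
    # Spelled-out name: look it up; leave unknown values untouched for review.
    return STATE_ABBREVIATIONS.get(cleaned.lower(), cleaned)


# Values as they might appear across different spreadsheets.
for raw in [" California ", "ny", "TX", "Ontario"]:
    print(repr(raw), "->", harmonize_state(raw))
```

Values the table does not know, like "Ontario" here, pass through unchanged so they can be reviewed rather than silently altered.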

June 16, 2014

Merging Multiple Data Sets or Excel Spreadsheets

Okay – here’s one that we just wish was not painful in Excel…but it is. Sometimes, you need to combine two spreadsheets – and everyone has a different term for this painful step in the data preparation process: combine, join, merge, mash up, wrangle, overlay, blend…you get the picture. And you have to do this either because you want... Read More

June 4, 2014

Quick & Easy Data Enrichment with 3rd Party Data

I’ve covered the enrichment topic in one of my Data Diva blogs – the requirement business analytic teams have to supplement their core system data with data that we buy or rent from third parties. Sometimes it’s the Census data, sometimes it’s Dun & Bradstreet, sometimes it is Nielsen information…and it’s all sitting in spreadsheets. I had to vent... Read More

May 27, 2014

New Data Exploration Tools … Goodbye Excel

Sometimes I work with data that I am not familiar with – either it is unstructured or came from a source that does not follow our naming conventions or data standards. In Excel, I would spend hours staring into the abyss, looking at data sets, columns and values just trying to locate things that were going to screw up... Read More

May 24, 2014

Hadoop … that is one big yellow elephant

The Divas recently interviewed our colleague and good friend, Rebecca Wong, and talked about Hadoop. For those of you who don’t know the big yellow elephant that has taken over the data room, meet Hadoop. It has completely disrupted the data management world. Hadoop makes it possible to collect and store more raw data than we ever imagined, at... Read More

April 21, 2014

The Trouble with Big Data: Data Veracity, Data Preparation

The Divas recently “interviewed” Joseph di Paolantonio, Principal Analyst of Data Archon and overall cool guy. The topic was around decisions being made with big data, and the serious pitfalls that happen when data is either not clean or complete.

The webcast started with how we have spent the last twenty years coming up with better ways to handle data,... Read More

April 12, 2014

Marketing People Aren’t Dummies … We Have Disparate Data Disease

Back in January 2014, Scott Brinker posted this Marketing Technology Landscape to his blog. If anyone ever wonders why Marketing people have a hard time being metrics-driven, let’s take a quick look at this supergraphic. My head hurts when I think about the massive companies who invest in three or four tools in each of these categories… and... Read More

April 12, 2014

Awakening the Data Divas

A few months ago, I was having a lively conversation with Lilia Gutnik, our Director of Product Marketing. It was around the idea that we both came to Paxata with unique experiences – specifically around data. I am the classic marketing person who has to constantly rely on data to justify my budget (and sometimes, my very existence) while Lilia had... Read More
