VALUE: Profiling and filtering dates in Paxata is a simple, fast way to understand the date/time characteristics of your data (such as trends, outliers, and unexpected values) that may need resolving before you push the data into production. It’s a great way to validate that you have received an accurate and complete set of data from upstream or third-party... Read More
Often data files have variable record types, identified by a control character in the first one or two positions of the record. This Tip of the Day walks you through the basic steps of importing and parsing the data.
Assume your data file looks like this:
The first character of each row identifies what type of row it is. Some examples:
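As a minimal sketch of the idea outside Paxata, the Python below groups rows by the control code in their first character. The sample lines and the record codes (`H` for header, `D` for detail, `T` for trailer) are hypothetical, chosen only for illustration; they are not taken from the original tip.

```python
from collections import defaultdict

# Hypothetical sample data: the first character marks the record type.
# The codes H (header), D (detail), and T (trailer) are assumptions.
SAMPLE_LINES = [
    "H20240101BATCH01",
    "D0001,WIDGET,12.50",
    "D0002,GADGET,7.25",
    "T2",
]


def split_by_record_type(lines, width=1):
    """Group record bodies by the control code in the first `width` characters."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        groups[line[:width]].append(line[width:])
    return dict(groups)


records = split_by_record_type(SAMPLE_LINES)
print(records["D"])
```

Pass `width=2` when the control code occupies the first two positions of the record.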
VALUE: For transactional and other low-level data, Paxata provides a method for flagging or marking individual rows that meet a given criterion or business rule. Working on complete (non-sampled) data gives users the ability to review, validate, isolate, and output results. This approach lets you operate within a spreadsheet paradigm while you maintain focus on the full scope of the data.
In... Read More
Often we need to remove rows from our Project, and doing so is very straightforward. There is one simple best practice for removing rows that you should be aware of.
The steps for removing rows are as follows: Using a Filtergram, isolate the rows which you want to remove. From the left-hand toolbar, click the Scissors icon (the step editor will now have... Read More
Importing data from a flat file into Paxata is very straightforward. Often, the default import settings work perfectly. Sometimes a user may need to adjust a couple of settings.
Today’s Tip of the Day focuses on handling flat files with leading zeros. Paxata will not remove leading zeros during import, yet it is important to know how leading zeros are handled... Read More
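To illustrate why leading zeros matter in general (this is plain Python, not Paxata's import logic), the sketch below reads a hypothetical flat file and compares the values kept as text with the same values converted to numbers. The column names, sample values, and the five-character width are assumptions.

```python
import csv
import io

# Hypothetical flat file; column names and values are made up for illustration.
RAW = "zip_code,amount\n02134,10\n00501,25\n"

rows = list(csv.DictReader(io.StringIO(RAW)))

# Kept as text, the leading zeros survive.
as_text = [row["zip_code"] for row in rows]

# Converted to integers, the leading zeros are lost.
as_number = [int(row["zip_code"]) for row in rows]

# They can be restored only if the intended width is known (assumed to be 5 here).
restored = [str(n).zfill(5) for n in as_number]

print(as_text, as_number, restored)
```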
By: Kumar Jayaram
Data in a multi-cloud hybrid world
Traditional data analytics environments were built on the Enterprise Data Warehouse (EDW) usually running in some relational database like Teradata, Microsoft, or Oracle. While those still exist, there are many more sources that analysts need to get access to in order to perform the kinds of analytics required. Big data/Hadoop: increasingly, the modern... Read More
Using “Auto Number” provides a unique index number for every row in your dataset. You can then bind the column to the sort order for other, existing columns in your dataset. Auto Number is useful when you need to: 1) track your dataset’s original order 2) assign row IDs to your dataset 3) identify sections or groups of rows in... Read More
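The general idea of a row index that tracks original order can be sketched in Python as follows. This is not Paxata's Auto Number implementation; the `row_id` column name and the sample rows are hypothetical.

```python
# Hypothetical rows in their original order.
rows = [{"name": "carol"}, {"name": "alice"}, {"name": "bob"}]

# Assign a unique, 1-based index to every row (the column name is an assumption).
numbered = [{"row_id": i, **row} for i, row in enumerate(rows, start=1)]

# Sorting by another column scrambles the original order...
by_name = sorted(numbered, key=lambda r: r["name"])

# ...but sorting by the index brings it back.
restored = sorted(by_name, key=lambda r: r["row_id"])

print([r["row_id"] for r in by_name])
```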
Isolating rows based on compound filters helps you zoom in on specific areas of investigation. This PaxTip of the Day shows you a technique to filter rows in a project based on a compound set of values. Simple & effective.
Assume you need to isolate all rows where ColumnA has the value “INVESTMENT” or “FUNDS” and ColumnB has the value “ACTIVE”.
Step... Read More
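The logic of such a compound filter can be sketched in Python: a row is kept only when every filtered column holds one of its allowed values. The sample rows and the `compound_filter` helper are hypothetical; in Paxata the same result is reached through the interface rather than code.

```python
# Hypothetical sample rows using the column names from the example above.
rows = [
    {"ColumnA": "INVESTMENT", "ColumnB": "ACTIVE"},
    {"ColumnA": "FUNDS", "ColumnB": "CLOSED"},
    {"ColumnA": "FUNDS", "ColumnB": "ACTIVE"},
    {"ColumnA": "LOANS", "ColumnB": "ACTIVE"},
]


def compound_filter(rows, allowed):
    """Keep rows where every listed column holds one of its allowed values."""
    return [
        row
        for row in rows
        if all(row.get(col) in values for col, values in allowed.items())
    ]


matches = compound_filter(
    rows,
    {"ColumnA": {"INVESTMENT", "FUNDS"}, "ColumnB": {"ACTIVE"}},
)
print(matches)
```

Values within one column combine with OR, while the columns themselves combine with AND.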
VALUE: When presenting your analysis you may hear: “How did you come up with this result?” or “Why does the data look like this?” or the dreaded “Your numbers don’t reconcile with mine…”. Paxata’s ClicktoPrep feature allows you to answer such questions quickly. This tip is part of a series which demonstrates the built-in data lineage features of Paxata. In this... Read More
Analysts frequently need to handle numeric outliers in their data. With the Filtergram feature in Paxata, isolating outliers is easy.
Follow these steps to isolate the numeric outliers in your dataset using a filtergram:
Step 1: Open a Filtergram on a numeric column by clicking on the data type option (see Image A).
Image A – A numeric Filtergram with a column... Read More
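For readers who want to check outliers outside the Filtergram, one common rule of thumb is the interquartile range (IQR) test, sketched below in Python. The excerpt does not say how the Filtergram itself identifies outliers, so the IQR rule, the 1.5 multiplier, and the sample values are all assumptions.

```python
import statistics

# Hypothetical numeric column.
values = [10, 12, 11, 13, 12, 11, 95, 10, 12, -40]


def iqr_outliers(data, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [x for x in data if x < low or x > high]


outliers = iqr_outliers(values)
print(outliers)
```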