This is the Linux app named data-diff, whose latest release can be downloaded as v0.9.7sourcecode.zip. It can be run online on OnWorks, a free hosting provider for workstations.
Download this app, data-diff, and run it online with OnWorks for free.
Follow these instructions in order to run this app:
- 1. Download this application to your PC.
- 2. Open our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 3. Upload this application to that file manager.
- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.
- 5. From the OnWorks Linux OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 6. Download the application, install it, and run it.
DESCRIPTION
We're excited to announce the launch of a new open-source product, data-diff, that makes comparing datasets across databases fast at any scale. data-diff automates data quality checks for data replication and migration. In modern data platforms, data is constantly moving between systems, and at modern data volumes and complexity, systems go out of sync all the time. Until now, there has not been any tooling to ensure that the data is copied correctly. Replicating data at scale, across hundreds of tables, with low latency and at a reasonable infrastructure cost is a hard problem, and most data teams we've talked to have faced data quality issues in their replication processes. The hard truth is that the quality of the replication is the quality of the data. Since copying entire datasets in batch is often infeasible at modern data scale, businesses rely on the Change Data Capture (CDC) approach of replicating data using a continuous stream of updates.
Features
- Find mismatches across databases
- Outputs a detailed diff of rows
- Simple CLI/API to create monitoring and alerts
- Verify 25M+ rows in <10s, and 1B+ rows in ~5min
- Verifies across many different databases
- Works for tables with 10s of billions of rows
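To make the idea of finding mismatches between two copies of a table more concrete, here is a minimal sketch in Python. It is not data-diff's API or its actual implementation; it only illustrates one common approach, which is to compare checksums over key ranges and drill down into the ranges that disagree. The table name `items`, the columns `id` and `value`, and the helper functions are all hypothetical, and two in-memory SQLite databases stand in for a source and a replica.

```python
import hashlib
import sqlite3


def segment_checksum(conn, table, lo, hi):
    """Return (row_count, checksum) for rows with lo <= id < hi.

    Assumption for brevity: rows are hashed in Python. A production tool
    would more likely compute the checksum inside the database.
    """
    rows = conn.execute(
        f"SELECT id, value FROM {table} WHERE id >= ? AND id < ? ORDER BY id",
        (lo, hi),
    ).fetchall()
    return len(rows), hashlib.md5(repr(rows).encode()).hexdigest()


def fetch_rows(conn, table, lo, hi):
    """Fetch the rows of one key range as a set of (id, value) tuples."""
    query = f"SELECT id, value FROM {table} WHERE id >= ? AND id < ?"
    return set(conn.execute(query, (lo, hi)).fetchall())


def diff_segment(src, dst, table, lo, hi, threshold=4):
    """Yield ('-', row) for rows only in src and ('+', row) for rows only in dst."""
    if segment_checksum(src, table, lo, hi) == segment_checksum(dst, table, lo, hi):
        return  # the segments match, so there is nothing to inspect
    if hi - lo <= threshold:
        # Small enough: compare the rows of this key range directly.
        a, b = fetch_rows(src, table, lo, hi), fetch_rows(dst, table, lo, hi)
        for row in sorted(a - b):
            yield ("-", row)
        for row in sorted(b - a):
            yield ("+", row)
        return
    # Otherwise split the key range in half and recurse into each half.
    mid = (lo + hi) // 2
    yield from diff_segment(src, dst, table, lo, mid, threshold)
    yield from diff_segment(src, dst, table, mid, hi, threshold)


def make_db(rows):
    """Create an in-memory SQLite database holding the hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    return conn


# The source has ids 0..99; the replica is missing id 42 and has a stale id 10.
source = make_db([(i, f"v{i}") for i in range(100)])
replica_rows = [(i, f"v{i}") for i in range(100) if i != 42]
replica_rows[10] = (10, "stale")
replica = make_db(replica_rows)

result = list(diff_segment(source, replica, "items", 0, 100))
for sign, row in result:
    print(sign, row)
```

In this sketch, ranges whose checksums agree are skipped without reading their rows one by one, so the work concentrates on the few ranges that differ.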
Programming Language
Python
Categories
This application can also be fetched from https://sourceforge.net/projects/data-diff.mirror/. It is hosted on OnWorks so that it can be run online in the easiest way from one of our free operating systems.