Package: dataCompareR 0.1.4

Sarah Johnston

dataCompareR: Compare Two Data Frames and Summarise the Difference

Easy comparison of two tabular data objects in R. Specifically designed to show differences between two sets of data in a useful way that should make it easier to understand the differences, and if necessary, help you work out how to remedy them. Aims to offer a more useful output than all.equal() when your two data sets do not match, but isn't intended to replace all.equal() as a way to test for equality.

Authors: Sarah Johnston [aut, cre], Rob Noble-Eddy [aut], Merlijn van Horssen [aut], Krishan Bhasin [aut], Sarah Pollicott [aut], Lukas Drapal [ctb], Nikolaos Perrakis [ctb], Nikhil Thomas Joy [ctb], Shahriar Asta [ctb], Karandeep Lidher [ctb], Dan Kellett [ctb], Kevin Chisholm [ctb], Laura Joy [ctb], Fergus Wadsley [ctb], Heather Hackett [ctb], David Robinson [ctb], Cheryl Renton [ctb], Matt Triggs [ctb], Carola Deppe [ctb], Ruijing Li [ctb], John Swift [ctb], Capital One Services, LLC [cph]

dataCompareR_0.1.4.tar.gz
dataCompareR_0.1.4.zip (r-4.5), dataCompareR_0.1.4.zip (r-4.4), dataCompareR_0.1.4.zip (r-4.3)
dataCompareR_0.1.4.tgz (r-4.4-any), dataCompareR_0.1.4.tgz (r-4.3-any)
dataCompareR_0.1.4.tar.gz (r-4.5-noble), dataCompareR_0.1.4.tar.gz (r-4.4-noble)
dataCompareR_0.1.4.tgz (r-4.4-emscripten), dataCompareR_0.1.4.tgz (r-4.3-emscripten)
dataCompareR.pdf | dataCompareR.html
dataCompareR/json (API)
NEWS

# Install 'dataCompareR' in R:
install.packages('dataCompareR', repos = c('https://capitalone.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/capitalone/datacomparer/issues

compare-data, data, data-analysis, data-science

7.14 score, 75 stars, 74 scripts, 465 downloads, 3 exports, 24 dependencies

Last updated 1 year ago from: 3153e5cd8d. Checks: OK: 3, NOTE: 4. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 12 2024
R-4.5-win | NOTE | Nov 12 2024
R-4.5-linux | NOTE | Nov 12 2024
R-4.4-win | NOTE | Nov 12 2024
R-4.4-mac | NOTE | Nov 12 2024
R-4.3-win | OK | Nov 12 2024
R-4.3-mac | OK | Nov 12 2024

Exports: generateMismatchData, rCompare, saveReport
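
As a rough illustration of how the three exports fit together, the sketch below compares two small data frames. The data frames `dfA` and `dfB` and their columns are made up for this example, and the `saveReport()` call is left commented out because it writes files; its report name and location here are assumptions, not values from the package documentation.

```r
library(dataCompareR)

# Hypothetical example data: same keys, one differing value in SCORE
dfA <- data.frame(ID = 1:3, SCORE = c(10, 20, 30), stringsAsFactors = FALSE)
dfB <- data.frame(ID = 1:3, SCORE = c(10, 25, 30), stringsAsFactors = FALSE)

# Compare the two data frames, matching rows on the ID column
comp <- rCompare(dfA, dfB, keys = "ID")

# Print a text summary of the comparison
summary(comp)

# Extract the rows of each input that take part in a mismatch
mism <- generateMismatchData(comp, dfA, dfB)

# Optionally write a report to a temporary directory (assumed arguments):
# saveReport(comp, reportName = "example_report",
#            reportLocation = tempdir(), showInViewer = FALSE)
```

When the two inputs are expected to be identical, `all.equal()` remains the simpler test; per the package description, the comparison object is mainly useful once the data sets are known to differ.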

Dependencies: cli, commonmark, dplyr, evaluate, fansi, generics, glue, highr, knitr, lifecycle, magrittr, markdown, pillar, pkgconfig, R6, rlang, stringi, tibble, tidyselect, utf8, vctrs, withr, xfun, yaml

dataCompareR vignette

Rendered from dataCompareR.Rmd using knitr::rmarkdown on Nov 12 2024.

Last update: 2022-05-06
Started: 2017-06-07

Readme and manuals

Help Manual

Help page | Topics
allVarMatchMessage | allVarMatchMessage
checkEmpty | checkEmpty
checkForRcompareCol | checkForRCompareCol
checkKeysExist | checkKeysExist
CheckNA | checkNA
Checks that a list of indexes are unique | checkUniqueness
cleanColNames : get colnames, remove leading and trailing whitespace and push to upper case | cleanColNames
coerceData | coerceData
coerceFactorsToChar: convert all factor type fields to characters | coerceFactorsToChar
collapseClasses. Collapse the classes of an object to a single string | collapseClasses
colsWithUnequalValues: a dataframe summarising a column with unequal values | colsWithUnequalValues
Compare data. Wrapper for comparison functionality. | compareData
compareNames : compare the intersect of colInfoA and colInfoB and return boolean of matched columns for each data frame | compareNames
Create a dataframe of the rows that don't match | createAntiSubset
Converts cleaning info into a format consumable by updateCompareObject. | createCleaningInfo
Converts the output of the column matching logic to something consumable by updateCompareObject. | createColMatching
Generates an empty list of the correct class to store results | createCompareObject
Takes the raw info for the meta block of the output and puts it in a format usable by the updateCompareObject function | createMeta
Create mismatch object | createMismatches
Create mismatch object | createMismatchObject
createReportText: prepares text which is used in the summary report Saves R markdown and HTML reports in the area specified by the user. Reports are called RcompareReport.Rmd (.html) Uses knitr package to create tables in the markdown (createReportText function) and HTML report. | createReportText
function for updating a compare object with information passed to it from the match rows function | createRowMatching
createTextSummary: create a text based summary of an dataCompareR object | createTextSummary
Place to store and access the current object version. | currentObjVersion
executeCoercions: | executeCoercions
Extract data from a dataCompareR comparison | generateMismatchData
Subsets on the variables that have a coercion. | getCoercions
Extracts the column names only in one data frame from a table of match information | getMismatchColNames
Check object is of class dataCompareRobject | is.dataCompareRobject
isNotNull: is object not null | isNotNull
isSingleNA | isSingleNA
listObsNotVerbose | listObsNotVerbose
listObsVerbose | listObsVerbose
Checks whether elements in two input data frames are equal. | locateMismatches
makeValidKeys | makeValidKeys
makeValidNames | makeValidNames
matchColumns : create subset of DFA and DFB to contain matching column names for both data frames | matchColumns
Generate two dataframes that contain the same rows based on a two-column index | matchMultiIndex
Generate two dataframes that contain the same rows based on a two-column index | matchNoIndex
Generate two dataframes and returns subsets of these dataframes that have shared rows. | matchRows
Generate two dataframes that contain the same rows based on a single index | matchSingleIndex
Creates a list of info about the dataframe. | metaDataInfo
mismatchHighStop Checks if we've exceeded threshold of mismatches | mismatchHighStop
orderColumns: order columns by treated column names | orderColumns
outputSectionHeader: creates an outputSectionHeader | outputSectionHeader
prepareData Prepares data for comparison in 3 stages. 1. Match columns - filter dataframes to those columns that match and summarise differences 2. Match rows - filter dataframes to those rows that match and summarise differences 3. Coerce data | prepareData
Printing RCompare Output | print.dataCompareRobject
Printing summaryRCompare Output | print.summary.dataCompareRobject
processFlow Handles the process flow for the whole package | processFlow
Compare two data frames | rCompare
rcompObjItemLength: return length of an item, returning 0 if null, and handling the fact that we might have a data frames or a vector | rcompObjItemLength
Round all numeric fields in a data frame | rounddf
Save a report based on a dataCompareR object | saveReport
subsetDataColumns : create subset of DFA and DFB to contain matching column names for both data frames | subsetDataColumns
Summarizing RCompare Output | summary.dataCompareRobject
trimCharVars: trim white spaces in character variables from an input dataframe | trimCharVars
Generic function for updating a compare object with information passed to it, that has methods based on the class of the info argument. | updateCompareObject
Updates cleaning info in the compare object | updateCompareObject.cleaninginfo
Adds a colMatching block to the output | updateCompareObject.colmatching
Adds a colMatching block to the output | updateCompareObject.matches
Takes raw info for meta and adds it to the compare object | updateCompareObject.meta
Adds a colMatching block to the output | updateCompareObject.mismatches
Adds a rowMatching block to the output | updateCompareObject.rowmatching
validateArguments | validateArguments
validateData : routine to validate the input data | validateData
Create variable mismatch details | variableDetails
Create variable mismatch table | variableMismatches
Warn users if the calculation is likely to be slow | warnLargeData
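
The help titles for cleanColNames and matchColumns describe cleaning column names (trim whitespace, push to upper case) and then keeping the columns common to both data frames. The base R sketch below illustrates that idea only; it is not the package's internal code, and `clean_col_names_sketch`, `dfA`, and `dfB` are hypothetical names introduced for this example.

```r
# Hypothetical helper: trims whitespace and upper-cases column names,
# following the behaviour described in the cleanColNames help title
clean_col_names_sketch <- function(df) {
  toupper(trimws(colnames(df)))
}

# Made-up inputs with inconsistent column naming
dfA <- data.frame(a = 1:2, b = 3:4)
colnames(dfA) <- c(" id ", "Score")
dfB <- data.frame(ID = 1:2, SCORE = 3:4, EXTRA = 5:6)

namesA <- clean_col_names_sketch(dfA)
namesB <- clean_col_names_sketch(dfB)

# Columns present in both data frames after cleaning
shared <- intersect(namesA, namesB)

# Columns found only in dfB
onlyB <- setdiff(namesB, namesA)
```

In this sketch `shared` holds the cleaned names that a comparison could proceed on, while `onlyB` lists columns that would be reported as present in only one input.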