Journal Article PUBDB-2025-04384

HarmonizR: blocking and singular feature data adjustment improve runtime efficiency and data preservation


2025
BioMed Central, London

BMC Bioinformatics 26(1), 47 (2025) [10.1186/s12859-025-06073-9]


Please use a persistent id in citations: doi:10.1186/s12859-025-06073-9

Abstract: Data integration is an essential tool for increasing statistical power during analysis, for example for complex multi-experiment data from (single-cell) RNA, proteomics and other omics technologies. Despite its benefits, data integration introduces internal biases, so-called batch effects. Because such methods inherently produce missing values, and data integration introduces additional ones, established algorithms such as ComBat and limma are unable to perform batch effect adjustment. Recently, the HarmonizR framework, a tool for missing-value-tolerant data adjustment, was presented for these cases.

In this contribution, we provide significant improvements to the HarmonizR approach. A novel blocking strategy is introduced to substantially reduce runtime while still supporting parallel architectures. Additionally, a “unique removal” strategy has been integrated into HarmonizR to retain even more features for adjustment. In this work, we show (1) substantially improved runtime for both small and large real datasets and (2) the ability to retain more features from the integrated dataset during adjustment, with a feature rescue of up to 103.9% for our tested datasets.

The proposed improvements address the shortcomings of the previously published HarmonizR version. Since HarmonizR was mainly developed for dataset integration on rare tumor entities, it did not include runtime improvements beyond parallelization, which this update addresses. The improved feature rescue further enhances the algorithm's ability to perform batch effect reduction quickly and robustly.

Contributing Institute(s):
  1. Informationstechnologie (IT)
Research Program(s):
  1. 623 - Data Management and Analysis (POF4-623)
Experiment(s):
  1. No specific instrument

Appears in the scientific report 2025
Database coverage:
Medline ; Creative Commons Attribution CC BY 4.0 ; DOAJ ; OpenAccess ; Article Processing Charges ; BIOSIS Previews ; Biological Abstracts ; Clarivate Analytics Master Journal List ; DOAJ Seal ; Ebsco Academic Search ; Essential Science Indicators ; Fees ; IF < 5 ; JCR ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection


 Record created 2025-10-15, last modified 2025-10-19

