| 001 | 639283 | ||
| 005 | 20251019055719.0 | ||
| 024 | 7 | _ | |a 10.1186/s12859-025-06073-9 |2 doi | 
| 024 | 7 | _ | |a 1471-2105 |2 ISSN | 
| 024 | 7 | _ | |a 10.3204/PUBDB-2025-04384 |2 datacite_doi | 
| 024 | 7 | _ | |a altmetric:174083920 |2 altmetric | 
| 024 | 7 | _ | |a pmid:39934730 |2 pmid | 
| 037 | _ | _ | |a PUBDB-2025-04384 | 
| 041 | _ | _ | |a English | 
| 082 | _ | _ | |a 610 | 
| 100 | 1 | _ | |a Schlumbohm, Simon |0 P:(DE-HGF)0 |b 0 |e Corresponding author | 
| 245 | _ | _ | |a HarmonizR: blocking and singular feature data adjustment improve runtime efficiency and data preservation | 
| 260 | _ | _ | |a London |c 2025 |b BioMed Central | 
| 336 | 7 | _ | |a article |2 DRIVER | 
| 336 | 7 | _ | |a Output Types/Journal article |2 DataCite | 
| 336 | 7 | _ | |a Journal Article |b journal |m journal |0 PUB:(DE-HGF)16 |s 1760617968_2110634 |2 PUB:(DE-HGF) | 
| 336 | 7 | _ | |a ARTICLE |2 BibTeX | 
| 336 | 7 | _ | |a JOURNAL_ARTICLE |2 ORCID | 
| 336 | 7 | _ | |a Journal Article |0 0 |2 EndNote | 
| 520 | _ | _ | |a Data integration is an essential tool for increasing statistical power during analysis, for example for complex multi-experiment data from (single-cell) RNA, proteomics and other omics experiments. Despite its benefits, data integration introduces internal biases, so-called batch effects. Because such methods inherently produce missing values and data integration introduces more, established algorithms such as ComBat and limma are unable to perform batch effect adjustment. The HarmonizR framework was recently presented for these cases as a tool for missing-value-tolerant data adjustment. In this contribution, we provide significant improvements to the HarmonizR approach. A novel blocking strategy is introduced to substantially reduce runtime while still supporting parallel architectures. Additionally, a “unique removal” strategy has been integrated into HarmonizR to retain even more features for adjustment. In this work, we show (1) substantially improved runtime for both small and large real datasets and (2) the ability to retain more features from the integrated dataset during adjustment, with a feature rescue of up to 103.9% for our tested datasets. The proposed improvements address the shortcomings of the previously published HarmonizR version. Since HarmonizR was mainly developed for dataset integration on rare tumor entities, it did not include runtime improvements beyond parallelization; this update addresses that gap. The improved feature rescue further enhances the algorithm's ability to perform batch effect reduction quickly and robustly. | 
| 536 | _ | _ | |a 623 - Data Management and Analysis (POF4-623) |0 G:(DE-HGF)POF4-623 |c POF4-623 |f POF IV |x 0 | 
| 588 | _ | _ | |a Dataset connected to CrossRef, Journals: bib-pubdb1.desy.de | 
| 693 | _ | _ | |0 EXP:(DE-MLZ)NOSPEC-20140101 |5 EXP:(DE-MLZ)NOSPEC-20140101 |e No specific instrument |x 0 | 
| 700 | 1 | _ | |a Neumann, Julia E. |b 1 | 
| 700 | 1 | _ | |a Neumann, Philipp |0 P:(DE-H253)PIP1106404 |b 2 |u desy | 
| 773 | _ | _ | |a 10.1186/s12859-025-06073-9 |g Vol. 26, no. 1, p. 47 |0 PERI:(DE-600)2041484-5 |n 1 |p 47 |t BMC bioinformatics |v 26 |y 2025 |x 1471-2105 | 
| 856 | 4 | _ | |y OpenAccess |u https://bib-pubdb1.desy.de/record/639283/files/s12859-025-06073-9.pdf | 
| 856 | 4 | _ | |y OpenAccess |x pdfa |u https://bib-pubdb1.desy.de/record/639283/files/s12859-025-06073-9.pdf?subformat=pdfa | 
| 909 | C | O | |o oai:bib-pubdb1.desy.de:639283 |p openaire |p open_access |p VDB |p driver |p dnbdelivery | 
| 910 | 1 | _ | |a Deutsches Elektronen-Synchrotron |0 I:(DE-588b)2008985-5 |k DESY |b 2 |6 P:(DE-H253)PIP1106404 | 
| 910 | 1 | _ | |a External Institute |0 I:(DE-HGF)0 |k Extern |b 2 |6 P:(DE-H253)PIP1106404 | 
| 913 | 1 | _ | |a DE-HGF |b Forschungsbereich Materie |l Materie und Technologie |1 G:(DE-HGF)POF4-620 |0 G:(DE-HGF)POF4-623 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-600 |4 G:(DE-HGF)POF |v Data Management and Analysis |x 0 | 
| 914 | 1 | _ | |y 2025 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0200 |2 StatID |b SCOPUS |d 2025-01-01 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0160 |2 StatID |b Essential Science Indicators |d 2025-01-01 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1050 |2 StatID |b BIOSIS Previews |d 2025-01-01 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1190 |2 StatID |b Biological Abstracts |d 2025-01-01 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0600 |2 StatID |b Ebsco Academic Search |d 2025-01-01 | 
| 915 | _ | _ | |a JCR |0 StatID:(DE-HGF)0100 |2 StatID |b BMC BIOINFORMATICS : 2022 |d 2025-01-01 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0501 |2 StatID |b DOAJ Seal |d 2024-04-10T15:34:04Z | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0500 |2 StatID |b DOAJ |d 2024-04-10T15:34:04Z | 
| 915 | _ | _ | |a WoS |0 StatID:(DE-HGF)0113 |2 StatID |b Science Citation Index Expanded |d 2025-01-01 | 
| 915 | _ | _ | |a Fees |0 StatID:(DE-HGF)0700 |2 StatID |d 2025-01-01 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0150 |2 StatID |b Web of Science Core Collection |d 2025-01-01 | 
| 915 | _ | _ | |a IF < 5 |0 StatID:(DE-HGF)9900 |2 StatID |d 2025-01-01 | 
| 915 | _ | _ | |a OpenAccess |0 StatID:(DE-HGF)0510 |2 StatID | 
| 915 | _ | _ | |a Peer Review |0 StatID:(DE-HGF)0030 |2 StatID |b ASC |d 2025-01-01 | 
| 915 | _ | _ | |a Article Processing Charges |0 StatID:(DE-HGF)0561 |2 StatID |d 2025-01-01 | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0300 |2 StatID |b Medline |d 2025-01-01 | 
| 915 | _ | _ | |a Creative Commons Attribution CC BY 4.0 |0 LIC:(DE-HGF)CCBY4 |2 HGFVOC | 
| 915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0199 |2 StatID |b Clarivate Analytics Master Journal List |d 2025-01-01 | 
| 920 | 1 | _ | |0 I:(DE-H253)IT-20120731 |k IT |l Informationstechnologie |x 0 | 
| 980 | _ | _ | |a journal | 
| 980 | _ | _ | |a VDB | 
| 980 | _ | _ | |a UNRESTRICTED | 
| 980 | _ | _ | |a I:(DE-H253)IT-20120731 | 
| 980 | 1 | _ | |a FullTexts | 