
musicobservatoryutils provides small, self-contained utility functions for music metadata processing and knowledge engineering workflows in the Open Music Observatory.

The package collects lightweight helpers for recurring ETL tasks such as identifier parsing, metadata normalization, and data repair. Functions are designed to minimize interdependencies and to serve as building blocks in larger, project-specific data pipelines rather than as a monolithic framework.

The focus is on transparent, conservative interpretation of metadata (e.g. ISRC parsing according to ISO 3901 and IFPI practices, or deterministic string normalization for identifier-style matching). The package does not assign authoritative identifiers and does not replace official standards bodies or registries.

Installation

You can install the development version of musicobservatoryutils from GitHub:

# install.packages("pak")
pak::pak("dataobservatory-eu/musicobservatoryutils")

Working with ISRC codes

A common task in music metadata pipelines is extracting structured information from International Standard Recording Codes (ISRCs):

library(musicobservatoryutils)
isrc_codes <- c("QZMEM2001409", "USA370575071", "NOUM70600224")

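You can resolve the allocating authority and, where applicable, the corresponding ISO country code. The two-letter ISRC prefix is not always a country code: in the output below, the prefix QZ resolves to the United States.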

isrc_resolve_registrar(isrc_codes)
#> # A tibble: 3 × 4
#>   isrc         isrc_country_code isrc_registrar country_code
#>   <chr>        <chr>             <chr>          <chr>       
#> 1 QZMEM2001409 QZ                United States  US          
#> 2 USA370575071 US                United States  US          
#> 3 NOUM70600224 NO                Norway         NO

You can also extract the ISRC registration year using a conservative, standards-aware interpretation:

#> [1] 2020 2005 2006
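
To make the structure behind these results concrete, here is a minimal base R sketch of how a 12-character ISRC splits into its fields: a two-letter prefix, a three-character registrant code, a two-digit year of reference, and a five-digit designation code. The helper `parse_isrc_sketch()` and its `current_year` argument are hypothetical and not part of musicobservatoryutils. The century rule it applies is an assumption made for illustration and may differ from the package's own interpretation.

```r
# Hypothetical helper for illustration; NOT part of musicobservatoryutils.
parse_isrc_sketch <- function(isrc,
                              current_year = as.integer(format(Sys.Date(), "%Y"))) {
  # Drop the hyphens used in the display form and upper-case the code.
  code <- toupper(gsub("-", "", isrc, fixed = TRUE))

  # Structural check: 2 letters, 3 alphanumerics, 2 digits, 5 digits.
  valid <- grepl("^[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}$", code)

  # Two-digit year of reference (characters 6 and 7).
  yy <- suppressWarnings(as.integer(substr(code, 6, 7)))

  # ASSUMED century rule: a two-digit year later than the current one
  # is read as 19YY, anything else as 20YY.
  year <- ifelse(yy <= current_year %% 100, 2000L + yy, 1900L + yy)

  # Set a field to NA wherever the code failed the structural check.
  keep <- function(x) {
    x[!valid] <- NA
    x
  }

  data.frame(
    isrc        = code,
    prefix      = keep(substr(code, 1, 2)),
    registrant  = keep(substr(code, 3, 5)),
    year        = keep(year),
    designation = keep(substr(code, 8, 12)),
    stringsAsFactors = FALSE
  )
}

parse_isrc_sketch(
  c("QZMEM2001409", "USA370575071", "NOUM70600224"),
  current_year = 2025L
)
```

Passing `current_year` explicitly keeps the result reproducible. Codes that fail the structural check keep their cleaned `isrc` value but get `NA` in every parsed field, so malformed input is flagged instead of being guessed at.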

Normalizing strings for identifier-style matching

The package includes helpers for deterministic, ASCII-based normalization of character strings for matching and reconciliation purposes (e.g. IPI-style name matching):

normalize_for_ipi("Седой Урал|Björk Guðmundsdóttir")
#> [1] "SEDOY URAL|BJORK GUDMUNDSDOTTIR"
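
The idea of deterministic normalization can be sketched in base R with an explicit character map. This is not the implementation of `normalize_for_ipi()`: the helper `normalize_sketch()` is hypothetical, its map covers only a handful of Latin letters chosen for this example, and it has no Cyrillic transliteration. Strings that still contain non-ASCII characters after mapping are returned as `NA`.

```r
# Hypothetical helper for illustration; NOT the package's normalize_for_ipi().
normalize_sketch <- function(x) {
  # Explicit map for a few accented Latin letters (assumed, deliberately small):
  # a-acute, e-acute, i-acute, o-acute, u-acute, o-umlaut, u-umlaut, eth.
  lower <- "\u00e1\u00e9\u00ed\u00f3\u00fa\u00f6\u00fc\u00f0"
  upper <- "\u00c1\u00c9\u00cd\u00d3\u00da\u00d6\u00dc\u00d0"
  x <- chartr(paste0(lower, upper), "aeiououdAEIOUOUD", x)

  # Trim, collapse runs of whitespace, and upper-case.
  x <- toupper(gsub("\\s+", " ", trimws(x)))

  # Conservative step: anything that is still not plain ASCII becomes NA,
  # because iconv() returns NA for strings it cannot convert.
  iconv(x, from = "UTF-8", to = "ASCII")
}

# "Björk  Guðmundsdóttir", written with escapes so the file stays ASCII-only.
normalize_sketch("Bj\u00f6rk  Gu\u00f0mundsd\u00f3ttir")
```

An explicit map gives the same result on every platform, unlike locale-dependent transliteration. The cost is that every supported script has to be enumerated by hand.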