Advanced Path Resolution

Sometimes data moves between environments during development, or you need to check multiple locations for files. This guide shows you how to set up dynamic path resolution that adapts to your workflow.

The Problem: Moving Data

Imagine this scenario with our friend Tidy McVerse:

  1. She starts programming with data in development: /demo/DEV/username/project1/data
  2. Halfway through, the data becomes production-ready and moves to: /demo/PROD/project1/data
  3. Her code should work without changes, regardless of where the data lives

Solution: Multiple Path Locations

Configure a path as a named list of candidate locations instead of a single string:

default:
  paths:
    data: !expr list(DEV = '/demo/DEV/username/project1/data', PROD = '/demo/PROD/project1/data')
    output: '/demo/DEV/username/project1/output'
    programs: '/demo/DEV/username/project1/programs'
    envsetup_environ: !expr Sys.setenv(ENVSETUP_ENVIRON = 'DEV'); 'DEV'
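
To make the !expr entries concrete, the sketch below evaluates the same two expressions by hand in base R. It assumes that the text after !expr is parsed and evaluated as R code when the configuration is loaded; the names data_expr, data_paths, environ_expr, and environ are illustrative only and are not part of envsetup.

```r
# Illustrative only: evaluate the expression text by hand, roughly as an
# !expr entry would be evaluated when the config file is loaded.
data_expr <- "list(DEV = '/demo/DEV/username/project1/data', PROD = '/demo/PROD/project1/data')"
data_paths <- eval(parse(text = data_expr))

# The result is an ordinary named list; DEV comes before PROD.
names(data_paths)
#> [1] "DEV"  "PROD"

# The envsetup_environ entry holds two statements: the first sets an
# environment variable as a side effect, the second is the value returned.
environ_expr <- "Sys.setenv(ENVSETUP_ENVIRON = 'DEV'); 'DEV'"
environ <- eval(parse(text = environ_expr))
environ
#> [1] "DEV"
Sys.getenv("ENVSETUP_ENVIRON")
#> [1] "DEV"
```

The order of the list elements matters later on, because the search walks the list from front to back.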

Working Example Setup

library(envsetup)

# Create temporary directory structure
dir <- fs::file_temp()
dir.create(dir)
config_path <- file.path(dir, "_envsetup.yml")

# Write configuration with multiple data paths
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  paths:
    data: !expr list(DEV = '", dir,"/demo/DEV/username/project1/data', PROD = '", dir, "/demo/PROD/project1/data')
    output: '", dir, "/demo/DEV/username/project1/output'
    programs: '", dir, "/demo/DEV/username/project1/programs'
    envsetup_environ: !expr Sys.setenv(ENVSETUP_ENVIRON = 'DEV'); 'DEV'"
 ), file_conn)
close(file_conn)

# Load and apply configuration
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv

Understanding the Configuration

Let’s examine what we now have available:

# See all configured objects
ls(envsetup_environment)
#> character(0)

# Data is now a named list with multiple locations
get_path(data)
#> $DEV
#> [1] "/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/DEV/username/project1/data"
#> 
#> $PROD
#> [1] "/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/PROD/project1/data"
get_path(output)
#> [1] "/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/DEV/username/project1/output"
get_path(programs)
#> [1] "/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/DEV/username/project1/programs"
get_path(envsetup_environ)
#> [1] "DEV"

Using read_path() for Smart File Location

The read_path() function checks each location in a path list and returns the full path of the first one that contains the requested file:

# Create the directory structure
dir.create(file.path(dir, "demo/DEV/username/project1/data"), recursive = TRUE)
dir.create(file.path(dir, "demo/PROD/project1/data"), recursive = TRUE)

# Add data only to PROD location
saveRDS(mtcars, file.path(dir, "demo/PROD/project1/data/mtcars.RDS"))

# read_path() finds the file in PROD
read_path(data, "mtcars.RDS")
#> Read Path:/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/PROD/project1/data/mtcars.RDS
#> [1] "/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/PROD/project1/data/mtcars.RDS"

Path Search Order

When the same file exists in more than one location, read_path() returns the first match in the search order:

# Add the same data to DEV location
saveRDS(mtcars, file.path(dir, "demo/DEV/username/project1/data/mtcars.RDS"))

# Now read_path() returns DEV location (first in search order)
read_path(data, "mtcars.RDS")
#> Read Path:/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/DEV/username/project1/data/mtcars.RDS
#> [1] "/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/DEV/username/project1/data/mtcars.RDS"

Controlling Search Order with envsetup_environ

The envsetup_environ value controls where the search through the path list starts:

  • DEV: Searches DEV first, then PROD
  • PROD: Searches only PROD (skips DEV)
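
The rule above can be sketched in base R. The function sketch_read_path() below is a hypothetical stand-in written for this guide, not envsetup's implementation: it assumes the search starts at the list element named by the environment value and continues toward the end of the list, and it returns NULL when nothing is found, which is a choice made for the sketch rather than documented read_path() behavior.

```r
# Hypothetical stand-in for the search behaviour described above.
# This is NOT envsetup's implementation.
sketch_read_path <- function(paths, filename,
                             environ = Sys.getenv("ENVSETUP_ENVIRON")) {
  paths <- as.list(paths)
  # Start at the element named after the environment, if there is one
  start <- match(environ, names(paths))
  if (!is.na(start)) {
    paths <- paths[start:length(paths)]
  }
  # Return the first candidate that exists on disk
  for (location in paths) {
    candidate <- file.path(location, filename)
    if (file.exists(candidate)) {
      return(candidate)
    }
  }
  NULL
}

# Throwaway directory tree with a DEV and a PROD location
root <- tempfile("search_order_")
paths <- list(DEV = file.path(root, "DEV"), PROD = file.path(root, "PROD"))
for (location in paths) dir.create(location, recursive = TRUE)

# File only in PROD: a DEV search falls through to PROD
saveRDS(mtcars, file.path(paths$PROD, "mtcars.RDS"))
only_prod <- sketch_read_path(paths, "mtcars.RDS", environ = "DEV")

# File in both: DEV wins for DEV, PROD skips DEV entirely
saveRDS(mtcars, file.path(paths$DEV, "mtcars.RDS"))
from_dev <- sketch_read_path(paths, "mtcars.RDS", environ = "DEV")
from_prod <- sketch_read_path(paths, "mtcars.RDS", environ = "PROD")

basename(dirname(c(only_prod, from_dev, from_prod)))
#> [1] "PROD" "DEV"  "PROD"
```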

Environment-Specific Path Resolution

Let’s add a production configuration that changes the search behavior:

# Update config to include prod environment
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  paths:
    data: !expr list(DEV = '",dir,"/demo/DEV/username/project1/data', PROD = '",dir,"/demo/PROD/project1/data')
    output: '",dir,"/demo/DEV/username/project1/output'
    programs: '",dir,"/demo/DEV/username/project1/programs'
    envsetup_environ: !expr Sys.setenv(ENVSETUP_ENVIRON = 'DEV'); 'DEV'

prod:
  paths:
    envsetup_environ: !expr Sys.setenv(ENVSETUP_ENVIRON = 'PROD'); 'PROD'"
  ), file_conn)
close(file_conn)

# Load production configuration
envsetup_config <- config::get(file = config_path, config = "prod")
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv

# Check the environment setting
get_path(envsetup_environ)
#> [1] "PROD"

Production Path Resolution

With the production configuration, path resolution behavior changes:

# In production, read_path() returns PROD location even though DEV exists
read_path(data, "mtcars.RDS")
#> Read Path:/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/PROD/project1/data/mtcars.RDS
#> [1] "/tmp/RtmpBB8Rks/file1e3d58cd99c1/demo/PROD/project1/data/mtcars.RDS"

Practical Usage Pattern

Here’s how you’d typically use this in your code:

# Instead of hardcoding paths:
# my_data <- readRDS("/some/hardcoded/path/mtcars.RDS")

# Use dynamic path resolution:
data_path <- read_path(data, "mtcars.RDS")
my_data <- readRDS(data_path)

# This works regardless of environment or data location!
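
If a file might be missing from every location, it can help to fail with a clear message before readRDS() is called. The helper below is a hypothetical sketch and not part of envsetup. It takes an already resolved path, such as the value returned by read_path(), and makes no assumption about what read_path() returns when nothing is found: NULL, NA, or a path that does not exist are all treated as failures.

```r
# Hypothetical helper: validate a resolved path before reading it
read_resolved_rds <- function(path) {
  ok <- is.character(path) && length(path) == 1 &&
    !is.na(path) && file.exists(path)
  if (!ok) {
    stop("Could not resolve an existing file to read.", call. = FALSE)
  }
  readRDS(path)
}

# A path that exists is read as usual
example_file <- tempfile(fileext = ".RDS")
saveRDS(mtcars, example_file)
my_data <- read_resolved_rds(example_file)
dim(my_data)
#> [1] 32 11

# An unresolved path produces a readable error instead of a readRDS() failure
result <- tryCatch(
  read_resolved_rds(NULL),
  error = function(e) conditionMessage(e)
)
result
#> [1] "Could not resolve an existing file to read."
```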

Benefits of Dynamic Paths

  1. Workflow Flexibility: Code works as data moves through environments
  2. Environment Awareness: Different search strategies per environment
  3. Fallback Logic: Automatic fallback to alternative locations
  4. Code Stability: No code changes needed when paths change

Common Patterns

data: !expr list(DEV = '/dev/path', PROD = '/prod/path')
envsetup_environ: 'DEV'  # Searches all locations starting from DEV

data: !expr list(DEV = '/dev/path', PROD = '/prod/path')
envsetup_environ: 'PROD'  # Searches only PROD location

Next Steps

The next guide covers automatic script sourcing, which loads custom functions from multiple script libraries across environments.