The autos configuration automatically sources R scripts from specified directories, making custom functions immediately available without manual sourcing. It is well suited to project-specific utility functions and shared code libraries.

Basic AUTOS Configuration

default:
  autos:
    script_library: '/path/to/your/scripts'

Note: By default, auto-sourcing overwrites any existing objects with the same name. To keep existing objects instead, pass overwrite = FALSE to rprofile().

Working Example: Single Script Library

Let’s create a practical example where Tidy McVerse has custom functions in a script library:

library(envsetup)

# Create temporary directory structure
dir <- fs::file_temp()
dir.create(dir)
dir.create(file.path(dir, "/demo/DEV/username/project1/script_library"), recursive = TRUE)

# Create a custom function
file_conn <- file(file.path(dir, "/demo/DEV/username/project1/script_library/test.R"))
writeLines(
"test <- function(){print('Hello from auto-sourced function!')}", file_conn)
close(file_conn)

# Write the configuration
config_path <- file.path(dir, "_envsetup.yml")
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'"
  ), file_conn)
close(file_conn)

Loading and Using Auto-Sourced Functions

# Load configuration and apply it
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb53d0b1a/demo/DEV/username/project1/script_library/test.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'test'

The auto-sourced functions are now available:

# See what functions are available
objects()
#> [1] "auto_stored_envsetup_config" "config_path"                
#> [3] "dir"                         "envsetup_config"            
#> [5] "file_conn"                   "test"

# Use the function directly (no manual sourcing needed!)
test()
#> [1] "Hello from auto-sourced function!"

Multiple Script Libraries

Real projects often have multiple script libraries for different purposes:

# Create production script library
dir.create(file.path(dir, "/demo/PROD/project1/script_library"), recursive = TRUE)

# Add production functions
file_conn <- file(file.path(dir, "/demo/PROD/project1/script_library/test2.R"))
writeLines(
"test2 <- function(){print('Hello from production function!')}", file_conn)
close(file_conn)

# Update configuration with multiple libraries
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'
    prod_script_library: '", dir,"/demo/PROD/project1/script_library'"
  ), file_conn)
close(file_conn)

# Reload configuration
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb53d0b1a/demo/DEV/username/project1/script_library/test.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'test'
#> 
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb53d0b1a/demo/PROD/project1/script_library/test2.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'test2'

Using Multiple Libraries

# List the objects in the global environment:
# functions from both libraries are now available
objects()
#> [1] "auto_stored_envsetup_config" "config_path"                
#> [3] "dir"                         "envsetup_config"            
#> [5] "file_conn"                   "test"                       
#> [7] "test2"

# Use functions from both libraries
test()   # From dev library
#> [1] "Hello from auto-sourced function!"
test2()  # From prod library
#> [1] "Hello from production function!"

Understanding Function Conflicts and the Overwrite Parameter

When auto-sourcing functions, you might encounter situations where function names conflict with existing objects in your environment. The overwrite parameter controls how these conflicts are handled.

Quick Example of Function Conflicts

# Create a function that might conflict
summary_stats <- function(data) {
  print("Original summary function")
}

# Create a script with the same function name
conflict_dir <- file.path(dir, "conflict_demo")
dir.create(conflict_dir)

file_conn <- file(file.path(conflict_dir, "stats.R"))
writeLines(
"summary_stats <- function(data) {
  print('Updated summary function from the new conflict_demo script')
}", file_conn)
close(file_conn)

# Add to configuration
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'
    prod_script_library: '", dir,"/demo/PROD/project1/script_library'
    conflict_demo: '", conflict_dir, "'"
  ), file_conn)
close(file_conn)

# When we reload, the auto-sourced version will overwrite the original
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb53d0b1a/demo/DEV/username/project1/script_library/test.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'test'
#> 
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb53d0b1a/demo/PROD/project1/script_library/test2.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'test2'
#> 
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb53d0b1a/conflict_demo/stats.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'summary_stats'
#> 
#>  The following objects were overwritten in .GlobalEnv:
#> 
#>     'summary_stats'

# Test which version we have now
summary_stats()
#> [1] "Updated summary function from the new conflict_demo script"

The output shows detailed information about what was overwritten, helping you track conflicts.

Environment-Specific Auto-Sourcing

You might want different script libraries for different environments. For example, exclude development functions when running in production:

# Configuration that blanks out dev scripts in production
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    dev_script_library: '", dir,"/demo/DEV/username/project1/script_library'
    prod_script_library: '", dir,"/demo/PROD/project1/script_library'

prod:
  autos:
    dev_script_library: NULL"  # NULL disables this library
  ), file_conn)
close(file_conn)

# Load production configuration
envsetup_config <- config::get(file = config_path, config = "prod")
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb53d0b1a/demo/PROD/project1/script_library/test2.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'test2'

Only the production functions are now available:

# Functions from production only
objects()
#> [1] "auto_stored_envsetup_config" "config_path"                
#> [3] "conflict_dir"                "dir"                        
#> [5] "envsetup_config"             "file_conn"                  
#> [7] "test2"

# Use functions from production only
test2()  # From prod library
#> [1] "Hello from production function!"

How Auto-Sourcing Works

When you call rprofile() with autos configuration:

  1. Script Discovery: Finds all .R files in specified directories
  2. Conflict Detection: Compares new functions with existing global environment objects
  3. Automatic Sourcing: Sources each script into its environment
  4. Conflict Resolution: Based on the overwrite parameter:
    • overwrite = TRUE (default): Replaces existing functions and reports what was overwritten
    • overwrite = FALSE: Preserves existing functions and reports what was skipped
  5. Metadata Tracking: Records which script each function came from for debugging
  6. Function Availability: Functions become directly accessible
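
The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not envsetup's implementation: sketch_autosource() and its arguments are hypothetical names, and the demo writes into a scratch environment rather than the global environment.

```r
# Illustrative sketch only; `sketch_autosource` is a hypothetical name,
# not an envsetup function.
sketch_autosource <- function(dir, overwrite = TRUE, target = globalenv()) {
  # Script discovery: every .R file in the directory
  scripts <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  report <- list(added = character(), overwritten = character(), skipped = character())

  for (script in scripts) {
    # Source into a temporary environment so nothing reaches `target` yet
    tmp <- new.env(parent = globalenv())
    sys.source(script, envir = tmp)
    new_names <- ls(tmp)

    # Conflict detection: which names already exist in `target`?
    clash <- vapply(new_names, exists, logical(1), envir = target, inherits = FALSE)
    conflicts <- new_names[clash]

    # Conflict resolution driven by `overwrite`
    to_assign <- if (overwrite) new_names else setdiff(new_names, conflicts)
    for (nm in to_assign) {
      assign(nm, get(nm, envir = tmp), envir = target)
    }

    report$added <- c(report$added, to_assign)
    if (overwrite) {
      report$overwritten <- c(report$overwritten, conflicts)
    } else {
      report$skipped <- c(report$skipped, conflicts)
    }
  }
  report
}

# Usage against a scratch environment instead of the global environment
lib <- tempfile("lib")
dir.create(lib)
writeLines("f <- function() 'from script'", file.path(lib, "f.R"))

scratch <- new.env()
assign("f", function() "original", envir = scratch)

kept <- sketch_autosource(lib, overwrite = FALSE, target = scratch)
kept_value <- scratch$f()          # existing definition is preserved
replaced <- sketch_autosource(lib, overwrite = TRUE, target = scratch)
replaced_value <- scratch$f()      # definition from the script wins
```

Sourcing into a temporary environment first is what makes the comparison possible: the script's objects are known before any of them touch the target.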

Technical Details of Conflict Handling

The auto-sourcing system handles conflicts in several stages:

  • Temporary Environment: Each script is first sourced into a temporary environment
  • Object Comparison: New objects are compared against the global environment
  • Selective Assignment: Only specified objects are moved to the global environment
  • Metadata Recording: Each function’s source script is recorded in object_metadata
  • Detailed Reporting: Users receive clear feedback about what was added, skipped, or overwritten
  • Cleanup Integration: Metadata enables precise cleanup when using detach_autos()

The record_function_metadata() function creates a comprehensive audit trail by maintaining a data frame with:

  • object_name: The name of each sourced function
  • script: The full path to the source script

This data frame is automatically updated when functions are overwritten by newer versions.
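
As a rough illustration of how such a table could be kept current, the sketch below replaces an object's row whenever it is recorded again. The helper record_metadata_sketch() and the example paths are hypothetical; the real record_function_metadata() may be implemented differently.

```r
# Hypothetical helper; envsetup's record_function_metadata() may differ.
record_metadata_sketch <- function(metadata, object_names, script) {
  if (length(object_names) == 0) {
    return(metadata)
  }
  # Drop stale rows for objects that are being recorded again (the overwrite case)
  metadata <- metadata[!metadata$object_name %in% object_names, , drop = FALSE]
  rbind(
    metadata,
    data.frame(object_name = object_names, script = script, stringsAsFactors = FALSE)
  )
}

# Start from an empty table with the two documented columns
meta <- data.frame(object_name = character(), script = character(),
                   stringsAsFactors = FALSE)
meta <- record_metadata_sketch(meta, c("load_data", "clean_data"), "/libs/a/data_functions.R")
# A later script redefines load_data, so its row now points at the newer script
meta <- record_metadata_sketch(meta, "load_data", "/libs/b/newer.R")
```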

Benefits of Auto-Sourcing

  1. No Manual Sourcing: Functions are automatically available
  2. Organized Libraries: Separate named directories for different script collections
  3. Conflict Visibility: Name clashes are reported, and overwrite = FALSE keeps existing functions intact
  4. Dynamic Loading: Easy to add/remove script libraries
  5. Team Collaboration: Shared function libraries across team members
  6. Comprehensive Tracking: Metadata system tracks function sources for debugging
  7. Intelligent Cleanup: Precise removal of auto-sourced functions via metadata

Common Use Cases

Project Utilities

autos:
  project_utils: '/project/utilities'
  data_processing: '/project/data_functions'
  plotting_functions: '/project/viz_functions'

Environment-Specific Functions

default:
  autos:
    dev_helpers: '/dev/helper_functions'
    shared_utils: '/shared/utilities'

prod:
  autos:
    shared_utils: '/shared/utilities'
    # dev_helpers excluded in production

Team Libraries

autos:
  team_functions: '/shared/team_library'
  personal_utils: '~/my_r_functions'
  project_specific: './project_functions'

Managing Function Conflicts with the Overwrite Parameter

The overwrite parameter controls how auto-sourcing handles situations where functions with the same name already exist in your global environment. Understanding this parameter is crucial for managing function conflicts effectively.

Default Behavior: Overwrite = TRUE

By default, auto-sourcing will overwrite existing functions:

# Create a function in global environment
my_function <- function() {
  print("Original function from global environment")
}

# Check it works
my_function()
#> [1] "Original function from global environment"

# Create a script with the same function name
dir <- fs::file_temp()
dir.create(dir)
script_dir <- file.path(dir, "scripts")
dir.create(script_dir)

file_conn <- file(file.path(script_dir, "my_function.R"))
writeLines(
"my_function <- function() {
  print('Updated function from auto-sourced script')
}", file_conn)
close(file_conn)

# Configuration; rprofile() uses overwrite = TRUE unless told otherwise
config_path <- file.path(dir, "_envsetup.yml")
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    my_scripts: '", script_dir, "'"
  ), file_conn)
close(file_conn)

# Load configuration - this will overwrite the existing function
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb1318486/scripts/my_function.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'my_function'
#> 
#>  The following objects were overwritten in .GlobalEnv:
#> 
#>     'my_function'

# The function has been overwritten
my_function()
#> [1] "Updated function from auto-sourced script"

Conservative Behavior: Overwrite = FALSE

When overwrite = FALSE, existing functions are preserved:

# clean up previous runs, removing all previously attached autos
detach_autos()

# Create a function in global environment
my_function <- function() {
  print("Original function from global environment")
}

# Check it works
my_function()
#> [1] "Original function from global environment"

# Create a script with the same function name
dir <- fs::file_temp()
dir.create(dir)
script_dir <- file.path(dir, "scripts")
dir.create(script_dir)

file_conn <- file(file.path(script_dir, "my_function.R"))
writeLines(
"my_function <- function() {
  print('Updated function from auto-sourced script')
}", file_conn)
close(file_conn)

# Same configuration as before; overwrite = FALSE is passed to rprofile() below
config_path <- file.path(dir, "_envsetup.yml")
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    my_scripts: '", script_dir, "'"
  ), file_conn)
close(file_conn)

envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config, overwrite = FALSE)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb3f057307/scripts/my_function.R' 
#> 
#>  The following objects were not added to .GlobalEnv as they already exist:
#> 
#>     'my_function'

my_function()
#> [1] "Original function from global environment"

Understanding Conflict Detection

The auto-sourcing system provides detailed feedback about what happens during sourcing:

# Create multiple functions to demonstrate conflict detection
existing_func1 <- function() "I exist in global"
existing_func2 <- function() "I also exist in global"

# Create script with mix of new and conflicting functions
file_conn <- file(file.path(script_dir, "mixed_functions.R"))
writeLines(
"# This will conflict with existing function
existing_func1 <- function() {
  'Updated from script'
}

# This is a new function
new_func <- function() {
  'Brand new function'
}

# This will also conflict
existing_func2 <- function() {
  'Also updated from script'
}", file_conn)
close(file_conn)

# Update configuration
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    my_scripts: '", script_dir, "'"
  ), file_conn)
close(file_conn)

# Reload - watch the detailed output
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb3f057307/scripts/mixed_functions.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'existing_func1', 'existing_func2', 'new_func'
#> 
#>  The following objects were overwritten in .GlobalEnv:
#> 
#>     'existing_func1', 'existing_func2'
#> 
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb3f057307/scripts/my_function.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'my_function'
#> 
#>  The following objects were overwritten in .GlobalEnv:
#> 
#>     'my_function'
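
If you want to see which names a script would add or replace before anything is changed, a dry run along the following lines is one option. It is a sketch in base R: preview_conflicts() is a hypothetical helper, not part of envsetup, and it still evaluates the script, only inside a throwaway environment.

```r
# Hypothetical helper, not part of envsetup
preview_conflicts <- function(script, target = globalenv()) {
  tmp <- new.env(parent = globalenv())
  sys.source(script, envir = tmp)   # evaluated, but only inside `tmp`
  defined <- ls(tmp)
  in_target <- ls(target, all.names = TRUE)
  list(
    new = setdiff(defined, in_target),
    conflicting = intersect(defined, in_target)
  )
}

# Demo with a scratch environment standing in for the global environment
demo_script <- tempfile(fileext = ".R")
writeLines(c(
  "existing_func1 <- function() 'Updated from script'",
  "new_func <- function() 'Brand new function'"
), demo_script)

demo_env <- new.env()
assign("existing_func1", function() "I exist in global", envir = demo_env)

preview <- preview_conflicts(demo_script, target = demo_env)
still_original <- demo_env$existing_func1()   # the target is left untouched
```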

Function Metadata Tracking

The auto-sourcing system records, for every object it sources, the object's name and the script it came from. This record is useful for debugging, auditing, and cleaning up auto-sourced functions.

How Metadata Tracking Works

Every time a function is sourced through the autos system, the record_function_metadata() function captures:

  • Object Name: The name of the function or object
  • Source Script: The full path to the script file that contains the function

This information is stored in a special object_metadata data frame within the envsetup_environment.

Accessing Function Metadata

# After sourcing functions, you can access the metadata
# Note: This example shows the concept - actual access depends on envsetup internals

# Create some functions to demonstrate metadata tracking
metadata_demo_dir <- file.path(dir, "metadata_demo")
dir.create(metadata_demo_dir)

# Create multiple scripts with different functions
file_conn <- file(file.path(metadata_demo_dir, "data_functions.R"))
writeLines(
"load_data <- function(file) {
  paste('Loading data from:', file)
}

clean_data <- function(data) {
  paste('Cleaning data with', nrow(data), 'rows')
}", file_conn)
close(file_conn)

file_conn <- file(file.path(metadata_demo_dir, "plot_functions.R"))
writeLines(
"create_plot <- function(data) {
  paste('Creating plot for', ncol(data), 'variables')
}

save_plot <- function(plot, filename) {
  paste('Saving plot to:', filename)
}", file_conn)
close(file_conn)

# Update configuration to include metadata demo
file_conn <- file(config_path)
writeLines(
  paste0(
"default:
  autos:
    metadata_demo: '", metadata_demo_dir, "'"
  ), file_conn)
close(file_conn)

# Source the functions
envsetup_config <- config::get(file = config_path)
rprofile(envsetup_config)
#> Assigned paths to __callr_data__Assigned paths to R_GlobalEnv
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb3f057307/metadata_demo/data_functions.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'clean_data', 'load_data'
#> 
#> Sourcing file:  '/tmp/Rtmpjcz3eR/file1dcb3f057307/metadata_demo/plot_functions.R' 
#> 
#>  The following objects are added to .GlobalEnv:
#> 
#>     'create_plot', 'save_plot'

# The system now tracks which script each function came from
cat("Functions sourced with metadata tracking:")
#> Functions sourced with metadata tracking:
knitr::kable(envsetup_environment$object_metadata)
|object_name |script                                                          |
|:-----------|:---------------------------------------------------------------|
|clean_data  |/tmp/Rtmpjcz3eR/file1dcb3f057307/metadata_demo/data_functions.R |
|create_plot |/tmp/Rtmpjcz3eR/file1dcb3f057307/metadata_demo/plot_functions.R |
|load_data   |/tmp/Rtmpjcz3eR/file1dcb3f057307/metadata_demo/data_functions.R |
|save_plot   |/tmp/Rtmpjcz3eR/file1dcb3f057307/metadata_demo/plot_functions.R |

Benefits of Metadata Tracking

1. Debugging Function Issues

When a function isn’t working as expected, metadata helps you quickly identify which script file contains the function.

2. Audit Trail

Metadata records which script supplied each auto-sourced object, giving you an audit trail of where the functions in your session came from.
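
Both uses come down to ordinary data frame lookups. The sketch below works on a stand-in data frame with the object_name and script columns described earlier; the paths are made up for illustration, and in a live session you would query envsetup_environment$object_metadata instead.

```r
# Stand-in for the metadata table; paths are illustrative only
meta <- data.frame(
  object_name = c("clean_data", "create_plot", "load_data", "save_plot"),
  script = c("/libs/demo/data_functions.R", "/libs/demo/plot_functions.R",
             "/libs/demo/data_functions.R", "/libs/demo/plot_functions.R"),
  stringsAsFactors = FALSE
)

# Debugging: which script defines clean_data?
source_of_clean_data <- meta$script[meta$object_name == "clean_data"]

# Auditing: which objects did each script contribute?
objects_by_script <- split(meta$object_name, meta$script)
```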

Metadata and the detach_autos() Function

The metadata tracking system integrates closely with cleanup operations. When you call detach_autos(), the metadata is used to:

  1. Identify all auto-sourced functions
  2. Remove them from the global environment
  3. Clean up the metadata records
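
A metadata-driven cleanup of this kind might look roughly like the following. This is a sketch under assumptions: cleanup_sketch() is a hypothetical name, and detach_autos() itself may work differently.

```r
# Hypothetical sketch; detach_autos() itself may work differently
cleanup_sketch <- function(metadata, target = globalenv()) {
  # Identify tracked objects that still exist in the target environment
  tracked <- metadata$object_name
  present <- tracked[vapply(tracked, exists, logical(1), envir = target, inherits = FALSE)]
  # Remove only those objects, leaving everything else alone
  rm(list = present, envir = target)
  # Return an emptied metadata table with the same columns
  metadata[0, , drop = FALSE]
}

# Demo with a scratch environment
session <- new.env()
assign("load_data", function(file) file, envir = session)
assign("clean_data", function(data) data, envir = session)
assign("my_own_object", 42, envir = session)   # not tracked, must survive

tracked_meta <- data.frame(
  object_name = c("load_data", "clean_data", "already_removed"),
  script = "/libs/demo/data_functions.R",
  stringsAsFactors = FALSE
)
tracked_meta <- cleanup_sketch(tracked_meta, target = session)
remaining <- ls(session)
```

Driving the removal from the metadata rather than from a name pattern is what keeps objects you created yourself out of the cleanup.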

Best Practices

Even though your functions are not a part of a package, you should follow best practices to ensure your functions work as expected.

  1. Use Clear Names: Library names should indicate their purpose
  2. Monitor Conflicts: Regularly check for and resolve function name conflicts
  3. Document Functions: Include roxygen2 comments in your functions
  4. Test Functions: Ensure auto-sourced functions work correctly
  5. Package Prefix: Qualify calls to package functions with their namespace inside your scripts, for example dplyr::filter(), so they do not depend on which packages are attached
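
As an illustration of points 3 and 5, a script in one of your libraries could look like the sketch below. The function summarise_values() is a made-up example, and stats:: stands in for any package prefix because stats ships with R.

```r
#' Summarise a numeric vector
#'
#' @param x A numeric vector; missing values are ignored.
#' @return A named numeric vector with the median and standard deviation.
summarise_values <- function(x) {
  # Explicit stats:: prefixes mean the result does not change if another
  # object named `median` or `sd` masks the originals
  c(
    median = stats::median(x, na.rm = TRUE),
    sd = stats::sd(x, na.rm = TRUE)
  )
}

summary_result <- summarise_values(c(1, 2, 3, NA))
```

The roxygen2 comments are not processed when a script is sourced, but they keep the documentation next to the code and make a later move into a package easier.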