Python-Style Comprehension in R — py_comprehension • scrubwren

Evaluate Python-like comprehensions in R. Supports multiple nested loops with optional conditions, and returns the results in a chosen Python collection type (list, tuple, set, or dict) or as a regular R list. Conceptually, it works like lapply() for Python objects, with additional return types and more flexible control over iteration.

Usage

py_comprehension(
  loop_spec_list,
  body,
  env = parent.frame(),
  format = py_builtins$list()
)

Arguments

loop_spec_list: Formula/List. A single formula or a list of formulas specifying the loops. Each formula must have the form var ~ iterable or var ~ iterable | condition, where var is the loop variable and iterable is a Python iterable. When multiple formulas are provided, the first defines the outermost loop and each subsequent formula defines a deeper loop, so the last formula is the innermost loop. The condition is wrapped in py_builtins$bool(), so it can be either an R logical value or any Python value that supports Python's truth-value testing.
body: Expression. An R expression to evaluate inside the innermost loop. Its result is collected into the comprehension output.
env: Environment. The parent environment for evaluation; defaults to the caller's environment. The comprehension is evaluated in a new environment whose parent is env, so variables created or modified inside body do not affect env unless body modifies it explicitly (for example with <<- or assign()).
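format: A Python collection object (py_builtins$list(), py_builtins$tuple(), py_builtins$set(), or py_builtins$dict()) or a regular R list (list()) that determines the type in which the results are returned. Defaults to a Python list (py_builtins$list()).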
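
As a rough illustration of the loop semantics described for loop_spec_list, the two formulas x ~ test | py_builtins$len(x) > 1 and z ~ x | z > 1 appear to correspond to the native Python comprehension below. This is a sketch of the presumed equivalence, not the package's implementation; the data mirrors the nested-loop example in the Examples section.

```python
# Assumed native-Python equivalent of two loop formulas with conditions.
test = [[1, 2, 3], [1, 2], [1]]

result = [
    z + 1                          # body, evaluated in the innermost loop
    for x in test if len(x) > 1    # first formula: outermost loop + condition
    for z in x if z > 1            # last formula: innermost loop + condition
]
print(result)
```

Each condition filters only its own loop level, so sublists of length 1 are skipped before the inner loop ever runs.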

Value

A Python collection or R list containing the results of the comprehension.

Details

Dictionary return type

When format = py_builtins$dict(), each evaluation of body must produce a key-value pair. Valid pair formats include:

A Python tuple of length 2
A Python list of length 2
A regular R list of length 2

The first element is used as the key and the second as the value. This allows creating dictionaries in Python style directly from R comprehensions.
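
For comparison, native Python builds a dictionary from length-2 pairs in the same way. The sketch below only shows the Python-side convention (first element is the key, second is the value); how py_comprehension converts the pairs internally is not stated here.

```python
# Pairs may be tuples or lists of length 2; dict() accepts both.
pairs = [(i, i ** 2) for i in range(1, 4)]   # tuples: (key, value)
squares = dict(pairs)

mixed = dict([(1, "a"), [2, "b"]])           # one tuple and one list
print(squares, mixed)
```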

Side effects

Like lapply(), this function evaluates body in a local scope, so assignments normally do not affect the caller's environment. To produce side effects, use <<-, assign() with a suitable environment, or modify an environment object directly (see environment()).
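
Python 3 comprehensions scope their loop variables in a comparable way, which may be a useful mental model. The snippet below is an analogy in plain Python, not a description of how py_comprehension is implemented.

```python
x = "outer"
squares = [x ** 2 for x in range(3)]   # this x is local to the comprehension
print(x)                               # the outer x is unchanged

# A side effect needs an object that lives outside the comprehension.
seen = []
_ = [seen.append(i) for i in range(3)]
print(seen)
```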

Performance warnings

Looping over Python objects in R can be inefficient. In each iteration, reticulate must pass handles between R and Python, often performing implicit or explicit object conversions and copies. If the body of your loop is lightweight and you need to iterate over a large Python object, consider defining a Python function via reticulate::py_run_string() or py_builtins$exec() and calling it directly. You can also use r.var (where var is any R variable name) to access or assign R objects directly from Python, which may help avoid unnecessary data transfer. Native Python tools are significantly faster in such cases!

For example, instead of doing:

y <- py_comprehension(x ~ reticulate::r_to_py(1:100000), x^2)

it is better to do:

function_def <- "
def list_square(n):
    return [x ** 2 for x in range(1, n + 1)]
"
my_func <- reticulate::py_run_string(function_def, local = TRUE, convert = FALSE)
y <- my_func$list_square(100000L)

Setting local = TRUE makes py_run_string() define the function in a private dictionary rather than in the main module. The main module created by reticulate automatically converts Python objects to R objects unless this behavior is disabled for the entire module, which could interfere with reticulate's internals. Defining the function in a private dictionary with convert = FALSE keeps the results as native Python objects, which is important when working with large integers that could otherwise overflow when converted back to R.
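
Seen from the Python side, the idea can be sketched as follows. This assumes that local = TRUE behaves roughly like executing the code into a fresh dictionary; the name namespace is illustrative only. The last check shows why overflow is a concern: the largest square exceeds the maximum value of an R integer (2147483647).

```python
function_def = """
def list_square(n):
    return [x ** 2 for x in range(1, n + 1)]
"""

namespace = {}                  # private dictionary, separate from the main module
exec(function_def, namespace)   # defines list_square inside namespace only
y = namespace["list_square"](100000)

R_INTEGER_MAX = 2 ** 31 - 1     # R integers are 32-bit signed
print(y[-1], y[-1] > R_INTEGER_MAX)
```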

Examples

if (FALSE) { # \dontrun{
# Simple Python-style comprehension
py_comprehension(
  i ~ reticulate::r_to_py(1:3),
  i^2
)

# Nested loops with conditions
test <- reticulate::r_to_py(list(list(1, 2, 3), list(1, 2), list(1)))
py_comprehension(
  c(
    x ~ test | py_builtins$len(x) > 1,
    z ~ x | z > 1
  ),
  {
    a <- z + 1
    a
  }
)

# Nested comprehension
py_comprehension(i ~ reticulate::r_to_py(1:5),
  py_comprehension(j ~ reticulate::r_to_py(1:5) | j >= i, j)
)

# Return results as a regular R list
py_comprehension(
  i ~ reticulate::r_to_py(1:3),
  i^2,
  format = list()
)

# Return results as a Python tuple
py_comprehension(
  i ~ reticulate::r_to_py(1:3),
  i^2,
  format = py_builtins$tuple()
)

# Return results as a Python set
py_comprehension(
  i ~ reticulate::r_to_py(c(1, 2, 2, 3)),
  i^2,
  format = py_builtins$set()
)

# Return results as a Python dict
py_comprehension(
  i ~ reticulate::r_to_py(1:3),
  list(i, i^2),
  format = py_builtins$dict()
)
} # }
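
For readers who want to check their expectations, the native Python comprehensions below are the presumed counterparts of the nested, tuple, and set examples above. They are a sketch in plain Python; the R calls return reticulate objects rather than these exact values.

```python
# Nested comprehension: one inner list per value of i.
nested = [[j for j in range(1, 6) if j >= i] for i in range(1, 6)]

# Tuple result: Python has no tuple comprehension, so wrap a generator.
as_tuple = tuple(i ** 2 for i in range(1, 4))

# Set result: the duplicate input 2 yields a single element 4.
as_set = {i ** 2 for i in [1, 2, 2, 3]}

print(nested, as_tuple, as_set)
```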