Introducing discretes 0.1.0
I kept running into the same issue when working with discrete probability distributions: how to represent a potentially infinite set of outcomes in a way that holds up under transformation.
On paper, this is straightforward. You write something like \[\{1, 2, 3, \ldots\}\] and move on.
In R, the closest analogue is a vector, which means choosing a cutoff like:
1:1000
That makes me uncomfortable, especially for software that is supposed to span multiple use cases. Once a computation needs to go beyond 1000, you are left with either failed functionality or a large, unwieldy vector.
The problem would be easy if it only ever involved integers. But as soon as you apply transformations like inversion and combination, the discrete values become harder to track. For example, invert our friend \(\{1, 2, 3, \ldots\}\) and you get \[\left\{1, \frac{1}{2}, \frac{1}{3}, \ldots \right\}.\] This is still a discrete set, but it no longer behaves like something you can index in order starting from a minimum. The values accumulate near zero (a “sink”), so there is no smallest element greater than 0 to start from.
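To make the cutoff problem concrete, here is a small base R sketch. It does not use discretes, and the cutoff of 1000 is the same arbitrary choice as before. It shows that the truncated, inverted vector stops at 1/1000, so anything closer to the sink is lost.

```r
# Truncate the natural numbers at an arbitrary cutoff, then invert.
cutoff <- 1000
inverted <- 1 / seq_len(cutoff)

# The smallest value the truncated vector can hold is 1 / cutoff.
min(inverted)

# 1/5000 belongs to the true inverted series, but not to the truncated one.
(1 / 5000) %in% inverted
```

Raising the cutoff only moves the boundary closer to 0; it never reaches the sink.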
The discretes package came out of trying to work with that structure directly, without forcing it into a finite representation first. It encodes the entire series by the rule that generates it, and lets you traverse the series and manipulate it further. You can also check whether a series has any sinks, and where they are.
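The idea of storing a rule rather than the values can be illustrated with a toy closure in base R. This is only a sketch of the general idea, not how discretes is implemented; `make_series`, `at`, and `map` are hypothetical names invented for this example.

```r
# Hypothetical illustration only; not the discretes implementation.
# A series is stored as a rule mapping a position k = 1, 2, 3, ... to a value.
make_series <- function(rule) {
  list(
    # Evaluate the rule at one or more positions, on demand.
    at = function(k) rule(k),
    # Transforming the series composes the rule with another function.
    map = function(f) make_series(function(k) f(rule(k)))
  )
}

naturals <- make_series(function(k) k)
inverse <- naturals$map(function(v) 1 / v)

inverse$at(1:5)   # values generated by the rule at positions 1 to 5
naturals$at(1e6)  # no cutoff: any position can be evaluated
```

Note that this toy version indexes the inverse series by generating position, not by sorted order, so it sidesteps the ordering and sink questions that the package addresses.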
Now \(\{1, 2, 3, \ldots\}\) gets its own representation:
library(discretes)
x <- natural1()
x
## Integer series of length Inf:
## 1, 2, 3, 4, 5, 6, ...
Inversion works as you’d expect it to. What are the previous 5 values in the inverse series going backwards from 1?
y <- 1 / natural1()
prev_discrete(y, from = 1, n = 5)
## [1] 0.5000000 0.3333333 0.2500000 0.2000000 0.1666667
This inverted series indeed has a sink at 0:
has_sink_at(y, 0)
## [1] TRUE
Numeric series are intended to operate like a numeric vector where possible, sorted from smallest to largest, and with no duplicates. Perhaps the best example of this is indexing.
x[1]
## [1] 1
While the natural numbers above have a well-defined “first” value, the inverse series does not, and indexing behaves as it would for an atomic vector with no such element:
y[1]
## [1] NA
The discretes package is now available on CRAN; learn more at its homepage, https://discretes.netlify.app/. While not a core probaverse package, it will be an important component of the probaverse project because it makes explicit support for discrete (and mixed!) probability distributions possible. I hope to see this functionality rolled out in the coming months, but progress on that work will depend on future funding.