Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Obtaining the simplicial complex for a given threshold #172

Open
davibarreira opened this issue Jun 6, 2024 · 1 comment
Open

Obtaining the simplicial complex for a given threshold #172

davibarreira opened this issue Jun 6, 2024 · 1 comment

Comments

@davibarreira
Copy link

First of all. Thanks for this package! I'm trying to learn a bit of TDA, and I'm trying to understand the outputs of the package. One thing I was wondering was whether it is possible to obtain the simplicial complex for a given threshold. For example, given a distance matrix and a "radius" of 0.1, I wanted to obtain the simplicial complex in order to visualize connections as the radius increased. Is it possible with the package? I saw that there is a Simplex data structure, but it does not seem to have the actual connections for the given input points.

@mtsch
Copy link
Owner

mtsch commented Jun 7, 2024

Hey! While you can't really construct a complex explicitly, you can enumerate the simplices in a given filtration. This is a bit hacky, but it works reliably. I'll demonstrate it with an example.

Start by creating some data and a filtration. Note that here, I've set the threshold in the filtration itself. If you want to compare different thresholds it might be better to leave it and filter the vector of simplices later (I show how to do this later).

julia> using Ripserer

# Create some data and a filtration on it
julia> data = [(rand(), rand()) for _ in 1:100]
julia> flt = Rips(data; threshold=0.1)

To get simplices, we'll use the (maybe not the most intuitively-named) Ripserer.columns_to_reduce. It takes a filtration and vector of all its d-simplices, and returns a vector of (d+1)-simplices. To get the starting point, use edges.

julia> edges = Ripserer.edges(flt)
julia> triangles = Ripserer.columns_to_reduce(flt, edges)
julia> tetrahedra = Ripserer.columns_to_reduce(flt, triangles)

You can keep doing this until you get to the dimension you're interested in. columns_to_reduce returns an iterator. Use collect to materialize it in a vector. Be careful here, these can get extremely large very quickly and could eat up all your memory.

julia> simplices = collect(tetrahedra)
25-element Vector{Simplex{3, Float64, Int64}}:
 -Simplex{3}((66, 34, 8, 1), 0.07873421536359702)
 -Simplex{3}((72, 47, 30, 2), 0.09496424105080373)
 -Simplex{3}((72, 47, 43, 2), 0.08662338431217294)
 -Simplex{3}((77, 70, 38, 6), 0.0879042651018931)
 -Simplex{3}((77, 44, 38, 6), 0.09089322446008455)
 -Simplex{3}((70, 44, 38, 6), 0.09089322446008455)
 -Simplex{3}((77, 70, 44, 6), 0.09089322446008455)
 -Simplex{3}((90, 76, 34, 8), 0.09562660381589932)
 -Simplex{3}((90, 66, 34, 8), 0.08434831089518131)
 -Simplex{3}((76, 66, 34, 8), 0.0986279317036634)
 
 -Simplex{3}((82, 51, 25, 24), 0.09448718717896652)
 -Simplex{3}((90, 76, 66, 34), 0.0986279317036634)
 -Simplex{3}((82, 71, 61, 35), 0.08840635543753256)
 -Simplex{3}((71, 65, 61, 35), 0.054352593410610384)
 -Simplex{3}((77, 70, 44, 38), 0.0625889837788639)
 -Simplex{3}((78, 75, 69, 49), 0.09192425486943887)
 -Simplex{3}((78, 74, 69, 49), 0.09709293326452263)
 -Simplex{3}((75, 74, 69, 49), 0.09709293326452263)
 -Simplex{3}((78, 75, 74, 49), 0.09709293326452263)
 -Simplex{3}((78, 75, 74, 69), 0.05476974181818138)

A simplex in Ripserer holds three pieces information: the sign, vertices and birth value. The sign returned by columns_to_reduce is arbitrary. You may want to use abs on the simplex to make it positive. You can use the vertices (or the simplex itself) to index back in to your data. As an example, lets see which tetrahedron here has the largest birth value:

julia> largest_birth_simplex = abs(argmax(birth, simplices))
3-dimensional Simplex(index=1259666, birth=0.0986279317036634):
   +(76, 66, 34, 8)

julia> [data[v] for v in Ripserer.vertices(largest_birth_simplex)]
4-element Vector{Tuple{Float64, Float64}}:
 (0.05901024586093817, 0.8981589899299338)
 (0.15531409254671724, 0.8768742999941058)
 (0.1546038616769584, 0.8956474221760998)
 (0.13120308962828497, 0.9069856484444736)

julia> data[largest_birth_simplex]
4-element StaticArraysCore.SizedVector{4, Tuple{Float64, Float64}, Vector{Tuple{Float64, Float64}}} with indices SOneTo(4):
 (0.05901024586093817, 0.8981589899299338)
 (0.15531409254671724, 0.8768742999941058)
 (0.1546038616769584, 0.8956474221760998)
 (0.13120308962828497, 0.9069856484444736)

If you want to reduce the threshold at this point, use filter like so:

julia> filter(sx -> birth(sx) < 0.09, simplices)
10-element Vector{Simplex{3, Float64, Int64}}:
 -Simplex{3}((66, 34, 8, 1), 0.07873421536359702)
 -Simplex{3}((72, 47, 43, 2), 0.08662338431217294)
 -Simplex{3}((77, 70, 38, 6), 0.0879042651018931)
 -Simplex{3}((90, 66, 34, 8), 0.08434831089518131)
 -Simplex{3}((55, 16, 15, 13), 0.08733924486750108)
 -Simplex{3}((95, 52, 46, 22), 0.0801360237769744)
 -Simplex{3}((82, 71, 61, 35), 0.08840635543753256)
 -Simplex{3}((71, 65, 61, 35), 0.054352593410610384)
 -Simplex{3}((77, 70, 44, 38), 0.0625889837788639)
 -Simplex{3}((78, 75, 74, 69), 0.05476974181818138)

As for plotting these I suggest collecting triangles and plotting the connections between all pairs in the simplex, or just collecting the edges and drawing a line for each of them.

Hope this is helpful! Let me know if you have any other questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants