Pyntacle is a Python package and command-line tool that eases the analysis of graphs. Its main goal is to find the important components of a graph, using topological indices built on the concepts of reachability, fragmentation and centrality. It also provides ancillary methods for community finding, set operations between graphs and quick conversion between data types. Pyntacle relies on multi-core and multi-process programming paradigms to speed up complex analysis routines. In the current release, GPU computing is available experimentally through the API.
A group of nodes in a network is central if its nodes are important collectively, not necessarily individually. Group centrality is assessed by calculating topological metrics.
These metrics either extend local metrics, such as the degree, closeness and betweenness centrality indices, to groups of nodes,
or are based on the newer concepts of fragmentation and reachability. A quick introduction to group centrality and to the strategies that Pyntacle implements to search for
relevant groups in a network is available here.
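To make the idea of extending a local metric to a group concrete, the sketch below computes group degree centrality in plain Python: the number of nodes outside the group that are adjacent to at least one group member, divided by the number of nodes outside the group. The toy graph and the function name group_degree are illustrative assumptions and are not part of the Pyntacle API.

```python
def group_degree(adjacency, group):
    """Fraction of non-group nodes adjacent to at least one group member."""
    group = set(group)
    outside = set(adjacency) - group
    if not outside:
        return 0.0
    # Union of the neighbours of every group member, excluding the group itself
    reached = set()
    for node in group:
        reached.update(adjacency[node])
    reached -= group
    return len(reached) / len(outside)


# Hypothetical undirected toy graph stored as an adjacency dictionary
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d"},
}

print(group_degree(toy_graph, {"a", "d"}))  # a and d together touch b, c and e
print(group_degree(toy_graph, {"c", "e"}))  # c and e only touch a and d
```

In this toy graph neither a nor d is adjacent to every other node, yet together they are adjacent to all of them, which is the sense in which a group can be central without its members being central individually.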
The easiest way to install Pyntacle on any Linux, Mac or Windows system is through Conda, since installing Pyntacle and all its dependencies by hand can be challenging for inexperienced users. We recommend Miniconda, the lightweight distribution. Using Conda to install not only Pyntacle but also Python and other packages has several advantages: it is cross-platform (Linux, macOS, Windows); it does not require administrative rights, because it installs into the user's home directory; and it lets you work in virtual environments, safe sandbox-like sub-systems that can be created, used, exported or deleted at will.
You can choose between the full Anaconda and its lite version, Miniconda. The difference between the two is that Anaconda comes with hundreds of packages and can be a bit heavier to install, while Miniconda allows you to create a minimal, self-contained Python installation, and then use the Conda command to install additional packages of your choice.
In any case, Conda is the package manager that the Anaconda and Miniconda distributions are built upon. It is both cross-platform and language agnostic (it can play a similar role to a pip and virtualenv combination), and you need to set it up by running either the Anaconda installer or the Miniconda installer, choosing the Python 3.7 version.
The next step is to create a new Conda environment (if you are familiar with virtual environments, this is analogous to a virtualenv).
On Linux or Mac, run the following command from a terminal window:
conda create -n name_of_my_env python=3.7
This will create a minimal environment with only Python 3.7 installed in it. To activate this environment, run:
source activate name_of_my_env
And finally, install the latest version of Pyntacle:
conda install -y -c bfxcss -c conda-forge pyntacle
On Windows, open a command prompt or (even better) an Anaconda prompt, and type:
conda create -y -n name_of_my_env python=3.7
Then, activate the newly created environment:
conda activate name_of_my_env
Finally, install the latest version of Pyntacle:
conda install -y -c bfxcss -c conda-forge pyntacle
Alternatively, Pyntacle can be built from source (Linux and Mac only). The source archive is available in the official Releases section on GitHub, and the documentation on GitHub gives more detailed instructions.
Pyntacle can also be executed in a ready-to-go Docker container. If you are familiar with Docker, a fully functional image is available for download from DockerHub.
Alternatively, you can build your own Docker image using this Dockerfile.
We developed a series of unit tests to ensure that the Pyntacle command-line interface is working properly.
We recommend running these tests before using Pyntacle.
In a shell, type:
pyntacle test
The expected output is similar to the following (the elapsed time and the object address will differ):
Ran 27 tests in 8.002s
OK
<pyntacle.pyntacle.App object at 0x7f7c22f8be10>
This message shows that all the Pyntacle tests ended successfully and that the command-line interface is ready to use. If the tests fail, please contact us and specify your OS, its version, the command you used to install Pyntacle and the output of the tests. You can redirect the output of the tests to a file (testlog.txt) as follows:
pyntacle test >> testlog.txt 2>&1
A quick start guide and three case studies are available to help new users learn the basic Pyntacle
commands and see how they can be applied in common network analysis contexts.
Because the command-line interface of Pyntacle was designed for non-experts, the fine-grained parallelism that handles the enumeration of all shortest paths is hidden from the user.
The coarse-grained parallelism is instead tunable through the -O/--nprocs argument
of the brute-force search algorithm. Depending on the size of the graph, its
level of sparseness and the number of processors employed, the computing mode is chosen by this simple algorithm:
// auto-select the computing mode
if nprocs > 1                                      // nprocs is user-defined
    enable multi-process, disable multi-threading
else if size(graph) < 250 or rho(graph) < 0.5      // rho measures sparseness
    disable multi-process, disable multi-threading
else
    disable multi-process, enable multi-threading
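The selection rule above can be written as a small Python function. This is only a sketch of the stated logic, not the code Pyntacle runs: the function name choose_computing_mode and its parameters are hypothetical, with graph_size and rho standing in for size(graph) and rho(graph).

```python
def choose_computing_mode(nprocs, graph_size, rho):
    """Return (multi_process, multi_thread) following the rule described above."""
    if nprocs > 1:
        # The user asked for several processes: coarse-grained parallelism only
        return True, False
    if graph_size < 250 or rho < 0.5:
        # Parallelism would not pay off here: run serially
        return False, False
    # Single process, large graph with rho >= 0.5: fine-grained parallelism only
    return False, True


print(choose_computing_mode(nprocs=4, graph_size=1000, rho=0.8))
print(choose_computing_mode(nprocs=1, graph_size=100, rho=0.8))
print(choose_computing_mode(nprocs=1, graph_size=1000, rho=0.8))
```

Note that the two kinds of parallelism are never enabled together: a user-defined nprocs greater than 1 always takes precedence over multi-threading.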
When multi-threading is enabled, the number of spawned threads equals the number of available cores minus one. This default can be changed by setting the environment variable NUMBA_NUM_THREADS to the desired number of computing cores.
Use this setting with caution, however: Numba adjusts the number of active threads on the fly according to the current overheads and, hence, the efficiency of parallelism.
This means that the value specified in the environment variable might not actually be respected.
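When Pyntacle is used from a script, the variable can also be set from Python itself. The sketch below assumes a hypothetical choice of 4 threads, and sets the variable before any module that loads Numba is imported, since Numba reads it at import time.

```python
import os

# Hypothetical value: request 4 threads for Numba's parallel CPU target
DESIRED_THREADS = 4

# Set the variable before importing anything that loads Numba
os.environ["NUMBA_NUM_THREADS"] = str(DESIRED_THREADS)

# Pyntacle modules would be imported only after this point, for example:
# from algorithms.bruteforce_search import BruteforceSearch

print(os.environ["NUMBA_NUM_THREADS"])
```

The same effect is obtained by exporting the variable in the shell before launching Python.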
Multi-threading can, however, be controlled explicitly through the API, as follows:
import time

from algorithms.bruteforce_search import BruteforceSearch
from io_stream.generator import PyntacleGenerator
from tools.enums import CmodeEnum, KpposEnum

if __name__ == '__main__':
    graph_rnd = PyntacleGenerator.Random([100, 0.6])

    # Time the brute-force search
    start = time.perf_counter()
    # Multi-threaded
    mreach_s = BruteforceSearch.reachability(
        graph_rnd,
        2,
        KpposEnum.mreach,
        None,
        m=2,
        cmode=CmodeEnum.cpu,
        nprocs=1)  # this is the default choice
    end = time.perf_counter()
    print("--- Elapsed time: {:.2f} seconds ---".format(end - start))
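For readers who want to see what a brute-force search for the best group by m-reach involves, here is a minimal, library-free sketch. It is not Pyntacle's implementation. The names m_reach and bruteforce_best_group and the toy graph are hypothetical, and m-reach is assumed here to count the nodes outside the group that lie within m steps of any group member.

```python
from collections import deque
from itertools import combinations


def m_reach(adjacency, group, m):
    """Count nodes outside `group` within m steps of any group member."""
    group = set(group)
    distance = {node: 0 for node in group}
    queue = deque(group)
    while queue:  # multi-source breadth-first search
        node = queue.popleft()
        if distance[node] == m:
            continue  # do not expand beyond m steps
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return len(distance) - len(group)


def bruteforce_best_group(adjacency, k, m):
    """Score every k-node group and return (best_score, best_groups)."""
    best_score, best_groups = -1, []
    for group in combinations(sorted(adjacency), k):
        score = m_reach(adjacency, group, m)
        if score > best_score:
            best_score, best_groups = score, [group]
        elif score == best_score:
            best_groups.append(group)
    return best_score, best_groups


# Hypothetical toy graph: two hubs (h1, h2) joined by an edge, each with leaves
toy = {
    "h1": {"a", "b", "c", "h2"},
    "h2": {"d", "e", "h1"},
    "a": {"h1"}, "b": {"h1"}, "c": {"h1"},
    "d": {"h2"}, "e": {"h2"},
}

print(bruteforce_best_group(toy, k=2, m=1))  # the two hubs reach all five leaves
```

The number of candidate groups grows combinatorially with the graph size and the group size k, which is why splitting the enumeration across processes or threads matters for larger graphs.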
GPU-based processing is an experimental feature in the current version and is therefore not covered by the command-line interface. The reason is that Numba behaves unexpectedly
with some hardware configurations, which might puzzle the user. The GPU feature will be stable in release 2.0, when Pyntacle will be able to manage big matrices,
for which replacing fine-grained parallelism with GPU computing would make sense.
However, GPU computing can already be enabled through the API:
...
    BruteforceSearch.reachability(
        graph_rnd,
        2,
        KpposEnum.mreach,
        None,
        m=2,
        cmode=CmodeEnum.gpu,
        nprocs=1)
...
Pyntacle is actively maintained, and new features are added regularly, including on user request through the GitHub repository. There you can track its development and find information about issues, future plans and new features.
Pyntacle is available under the GNU General Public License v3.0.