Open source SASA calculations
The API is declared in the header freesasa.h. The other source files and headers in the repository are for internal use and are not presented here, but they are documented in the source itself. The file example.c contains a simple program that illustrates how to use the API to read a PDB file from stdin, then calculate and print the SASA.
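To calculate the SASA of a structure, there are two main options: read the structure from a PDB file, which assigns a radius to each atom, and run the calculation on it; or supply arrays of coordinates and radii directly to freesasa_calc_coord().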
The following explains how to use FreeSASA to calculate the SASA of a hypothetical PDB file (1abc.pdb). One or more error checks should be performed at each step, but these are omitted here for brevity. See the documentation of each function for the errors that can occur. Default parameters are used at every step; the section Customizing behavior explains how to configure the calculations.
The function freesasa_structure_from_pdb() reads the atom coordinates from a PDB file and assigns a radius to each atom. The third argument can be used to pass options for how to read the PDB file.
We are commonly interested in the polar and apolar areas of a molecule. These can be calculated with freesasa_result_classes(). To get other classes of atoms we can either define our own classifier, or use freesasa_select_area(), described in the next section. The return type freesasa_nodearea is a struct that contains the total area, the areas of all apolar and polar atoms, and the areas of main-chain and side-chain atoms.
Groups of atoms can be defined using freesasa_selection_new(), which takes a selection definition that uses a subset of the PyMOL select syntax.
In addition to the flat array of results in freesasa_result, and the global values returned by freesasa_result_classes(), FreeSASA has an interface for navigating the results as a tree. The leaf nodes are individual atoms, and there are parent nodes at the residue, chain, and structure levels. The function freesasa_calc_tree() does a SASA calculation and returns the root node of such a tree. (If one already has a freesasa_result, the function freesasa_tree_init() can be used instead.) Each node stores a freesasa_nodearea for the sum of all atoms belonging to the node. The tree can be traversed with freesasa_node_children(), freesasa_node_parent() and freesasa_node_next(), and the area, type and name of a node can be accessed using freesasa_node_area(), freesasa_node_type() and freesasa_node_name(). Additionally, there are special properties for each level of the tree.
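The tree layout described above can be pictured with a toy data structure. The sketch below does not call the library and is not FreeSASA's internal representation: demo_node, demo_node_sum() and demo_count_level() are hypothetical names invented for this illustration. It assumes a first-child/next-sibling layout and only shows the idea that leaves are atoms and that each parent's area is the sum over the atoms beneath it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node levels, mirroring the levels named in the text. */
typedef enum { DEMO_ATOM, DEMO_RESIDUE, DEMO_CHAIN, DEMO_STRUCTURE } demo_level;

/* Toy tree node (not a FreeSASA type): first child plus next sibling. */
typedef struct demo_node {
    demo_level level;
    const char *name;
    double area;                /* atoms: own area; parents: set by demo_node_sum() */
    struct demo_node *children; /* first child, NULL for a leaf */
    struct demo_node *next;     /* next sibling, NULL if last */
} demo_node;

/* Set every parent's area to the sum of its children's areas and
   return the area of the node itself. */
double demo_node_sum(demo_node *node)
{
    assert(node != NULL);
    if (node->children == NULL) return node->area; /* leaf: keep own area */
    double total = 0.0;
    for (demo_node *c = node->children; c != NULL; c = c->next)
        total += demo_node_sum(c);
    node->area = total;
    return total;
}

/* Count the nodes of a given level in the subtree rooted at node. */
int demo_count_level(const demo_node *node, demo_level level)
{
    assert(node != NULL);
    int count = (node->level == level) ? 1 : 0;
    for (const demo_node *c = node->children; c != NULL; c = c->next)
        count += demo_count_level(c, level);
    return count;
}
```

Walking children with the child pointer and siblings with the next pointer is the same access pattern one would follow with a children/next style of traversal interface.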
The tree structure can also be exported to an RSA, JSON or XML file using freesasa_tree_export(). The RSA format is fixed, but the user can select which levels of the tree to include in JSON and XML. The following illustrates how one would generate a tree and export it to XML, including nodes for the whole structure, chains and residues (but excluding individual atoms).
If users wish to supply their own coordinates and radii, these are accepted as arrays of doubles passed to the function freesasa_calc_coord(). The coordinate array should have size 3*n, with the three coordinates of each atom stored consecutively (x1, y1, z1, x2, y2, z2, ..., xn, yn, zn).
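As an illustration of preparing such input, the sketch below packs per-atom records into a flat coordinate array of size 3*n and a radius array of size n. It assumes the x, y and z values of each atom are stored consecutively. The demo_atom type and the demo_pack_atoms() helper are hypothetical names made up for this example and are not part of freesasa.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical atom record used only in this sketch (not a FreeSASA type). */
typedef struct {
    double x, y, z;
    double radius;
} demo_atom;

/* Pack n atoms into a flat coordinate array (3*n doubles, assumed order
   x1,y1,z1,x2,y2,z2,...) and a radius array (n doubles).
   The caller provides and owns both output arrays. */
void demo_pack_atoms(const demo_atom *atoms, size_t n,
                     double *xyz, double *radii)
{
    assert(atoms != NULL && xyz != NULL && radii != NULL);
    for (size_t i = 0; i < n; ++i) {
        xyz[3 * i]     = atoms[i].x;
        xyz[3 * i + 1] = atoms[i].y;
        xyz[3 * i + 2] = atoms[i].z;
        radii[i]       = atoms[i].radius;
    }
}
```

The two arrays, together with the atom count n, are the kind of input that freesasa_calc_coord() is described as accepting.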
The principle for error handling is that unpredictable errors should not cause a crash, but rather allow the user to exit gracefully or make another attempt. Therefore, errors due to user or system failures, such as faulty parameters, malformed config files, I/O errors or out-of-memory errors, are reported through return values, either FREESASA_FAIL or FREESASA_WARN, or by NULL pointers, depending on the context (see the documentation for the individual functions).
Errors that are attributable to programmers using the library, such as passing null pointers where not allowed, are checked by asserts.
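A minimal sketch of this convention follows. It is a self-contained toy that does not use the library: DEMO_SUCCESS, DEMO_WARN, DEMO_FAIL and demo_parse_radius() are hypothetical stand-ins, and the numeric values of the status codes are arbitrary choices for this example rather than the values used by FreeSASA.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status codes standing in for a success value and for
   warning/failure return values; the numbers are arbitrary. */
enum { DEMO_SUCCESS = 0, DEMO_WARN = 1, DEMO_FAIL = 2 };

/* Parse a radius from user-supplied text.
   - NULL arguments are a programming error and are caught by assert.
   - Unparseable, negative or NaN input is a user error: return DEMO_FAIL
     and leave *radius untouched.
   - A zero radius is usable but suspicious: store it and return DEMO_WARN. */
int demo_parse_radius(const char *text, double *radius)
{
    assert(text != NULL && radius != NULL);
    char *end = NULL;
    double value = strtod(text, &end);
    if (end == text || *end != '\0' || !(value >= 0.0)) return DEMO_FAIL;
    *radius = value;
    return (value == 0.0) ? DEMO_WARN : DEMO_SUCCESS;
}
```

A caller would check the return value after each call and decide whether a warning is acceptable or whether to stop, which keeps recoverable input problems separate from programming mistakes.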
It should be clear from the documentation when functions have side effects such as memory allocation and I/O, and thread-safety should generally not be an issue (to the extent that your C library has thread-safe I/O and dynamic memory allocation). The SASA calculation itself can be parallelized by using a freesasa_parameters struct with freesasa_parameters.n_threads > 1 (default is 2) where appropriate. This only has a significant effect on performance for large proteins or at high precision, and because not all steps are parallelized, it is usually not worthwhile to go beyond 2 threads.
The types freesasa_parameters and freesasa_classifier can be used to change the parameters of the calculations. Users who wish to use the defaults can pass NULL wherever pointers to these are requested.
Calculation parameters can be stored in a freesasa_parameters object. It can be initialized to the default values by copying freesasa_default_parameters.
The following code would run a high precision Shrake & Rupley calculation with 10000 test points on the provided structure.
Classifiers are used to determine which atoms are polar or apolar, and to specify atomic radii. In addition, the three standard classifiers (see below) have reference values for the maximum areas of the 20 standard amino acids, which can be used to calculate relative areas of residues, as in the RSA output.
The default classifier is available through the const variable freesasa_default_classifier. This uses the ProtOr radii, defined in the paper by Tsai et al. (JMB 1999, 290: 253) for the standard amino acids (20 regular plus SEC, PYL, ASX and GLX), for some capping groups (ACE/NH2) and the standard nucleic acids. If the element can't be determined or is unknown, a zero radius is assigned. It classes all carbons as apolar and all other known atoms as polar.
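The polarity rule stated above (carbon is apolar, every other known element is polar) can be sketched in a few lines. The code below is a toy and not the library's classifier: demo_class, demo_classify() and demo_sum_by_class() are hypothetical names, the list of known elements is an assumption made for the example, and no radii are assigned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_APOLAR, DEMO_POLAR, DEMO_UNKNOWN } demo_class;

/* Elements this toy classifier recognizes. The list is an assumption
   for the example, not the set of elements known to FreeSASA. */
static const char *const demo_known[] = {"C", "N", "O", "S", "P", "SE", "H"};

/* Carbon is apolar, any other known element is polar, and anything
   else is reported as unknown. */
demo_class demo_classify(const char *element)
{
    assert(element != NULL);
    for (size_t i = 0; i < sizeof demo_known / sizeof demo_known[0]; ++i) {
        if (strcmp(element, demo_known[i]) == 0)
            return (strcmp(element, "C") == 0) ? DEMO_APOLAR : DEMO_POLAR;
    }
    return DEMO_UNKNOWN;
}

/* Sum per-atom areas by class. totals must hold three doubles and is
   indexed by demo_class. */
void demo_sum_by_class(const char *const *elements, const double *sasa,
                       size_t n, double totals[3])
{
    assert(elements != NULL && sasa != NULL && totals != NULL);
    totals[DEMO_APOLAR] = totals[DEMO_POLAR] = totals[DEMO_UNKNOWN] = 0.0;
    for (size_t i = 0; i < n; ++i)
        totals[demo_classify(elements[i])] += sasa[i];
}
```

Given a flat array of per-atom areas, summing by class in this way yields polar and apolar totals of the kind reported for a whole structure.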
Users can provide their own classifiers through Classifier configuration files. At the moment these do not allow the user to specify reference values to calculate relative SASA values for RSA output.
The default behavior of freesasa_structure_from_pdb(), freesasa_structure_array(), freesasa_structure_add_atom() and freesasa_structure_add_atom_wopt() is to first try the provided classifier and then guess the radius if necessary. When a radius is guessed, a warning is emitted and the van der Waals radii defined by Mantina et al. (J Phys Chem 2009, 113: 5806) are used.
See the documentation for these functions for what parameters to use to change the default behavior.