Open source SASA calculations
Building FreeSASA creates the binary
freesasa, which is installed by
make install. Calling
$ freesasa -h
displays a help message listing all options. The following text explains how to use most of them.
In the following we will use the RNA/protein complex PDB structure 3WBM as an example. It has four protein chains A, B, C and D, and two RNA strands X and Y. To run a simple SASA calculation using default parameters, simply type:
$ freesasa 3wbm.pdb
This generates the following output
## FreeSASA 2.0 ##

PARAMETERS
algorithm    : Lee & Richards
probe-radius : 1.400
threads      : 2
slices       : 20

INPUT
source  : 3wbm.pdb
chains  : ABCDXY
atoms   : 3714

RESULTS (A^2)
Total   :   25190.77
Apolar  :   11552.38
Polar   :   13638.39
CHAIN A :    3785.49
CHAIN B :    4342.33
CHAIN C :    3961.12
CHAIN D :    4904.30
CHAIN X :    4156.46
CHAIN Y :    4041.08
The results are all in the unit Ångström-squared.
If higher precision is needed, the command
$ freesasa -n 100 3wbm.pdb
specifies that the calculation should use 100 slices per atom instead of the default 20. The command
$ freesasa --shrake-rupley -n 200 --probe-radius 1.2 --n-threads 4 3wbm.pdb
instead calculates the SASA using Shrake & Rupley's algorithm with 200 test points, a probe radius of 1.2 Å, using 4 parallel threads to speed things up.
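To make the roles of the test points and the probe radius concrete, the following is a minimal pure-Python sketch of the Shrake & Rupley idea: place test points on a sphere of radius (atomic radius + probe radius) around each atom and count the fraction that is not inside the expanded sphere of any other atom. This is not FreeSASA's implementation; the function names, the golden-spiral point layout and the brute-force neighbor loop are assumptions chosen for brevity.

```python
import math

def sphere_points(n):
    """Return n roughly evenly spaced points on the unit sphere
    (golden-spiral layout, an assumption of this sketch)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

def shrake_rupley(atoms, probe_radius=1.4, n_points=100):
    """atoms is a list of (x, y, z, radius) tuples in Angstrom.
    Returns the SASA of each atom in Angstrom^2."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe_radius
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            # Brute-force check against every other atom (O(N^2) overall).
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                cut = rj + probe_radius
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < cut * cut:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas

# Two overlapping atoms of radius 1.0 A, 2.0 A apart, 200 test points each.
print(shrake_rupley([(0.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0)], 1.4, 200))
```

More test points reduce the discretization error at the cost of run time, which is the trade-off the -n option controls.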
If the user wants to use their own atomic radii the command
$ freesasa --config-file <file> 3wbm.pdb
reads a configuration from a file and uses it to assign atomic radii. The program will halt if it encounters atoms in the PDB input that are not present in the configuration. See Classifier configuration files for instructions on how to write a configuration.
To use the atomic radii from NACCESS call
$ freesasa --radii=naccess 3wbm.pdb
Another way to specify a custom set of atomic radii is to store them as occupancies in the input PDB file
$ freesasa --radius-from-occupancy 3wbm.pdb
This option allows the user to first use the option
--format=pdb (see PDB) to generate a PDB file with the radii used in the calculation, modify the radii of individual atoms in that file, and then recalculate the SASA with these modified radii.
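The "modify the radii" step can be scripted. The sketch below rewrites the occupancy field of a single ATOM record, assuming the standard fixed-column PDB layout in which occupancy occupies columns 55-60. The helper name set_radius is hypothetical and not part of FreeSASA.

```python
def set_radius(atom_line, radius):
    """Return a copy of an ATOM/HETATM record whose occupancy field
    (columns 55-60) holds `radius`. Other records are returned unchanged."""
    if not atom_line.startswith(("ATOM", "HETATM")):
        return atom_line
    padded = atom_line.rstrip("\n").ljust(66)
    return padded[:54] + f"{radius:6.2f}" + padded[60:]

line = "ATOM      1  N   THR A   5     -19.727  29.259  13.573  1.64  9.44"
print(set_radius(line, 1.80))
```

The modified file can then be passed back to freesasa with --radius-from-occupancy.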
In addition to the standard output format above, FreeSASA can export the results as JSON, XML, PDB, RSA, SASA of each residue type and SASA of each residue, using the option
--format. The level of detail of JSON and XML output can be controlled with the option
--output-depth=<depth> which takes the values
atom, residue, chain and structure. If atom is chosen, SASA values are shown for all levels of the structure, including individual atoms. With
chain, only structure and chain SASA values are printed (this is the default).
The output can include relative SASA values for each residue. To calculate these, a reference SASA value is needed, calculated using the same atomic radii. At the moment such values are only available for the ProtOr and NACCESS radii (selected using the option
--radii); if other radii are used, relative SASA will be excluded (in RSA output all REL columns will have the value 'N/A').
The reference SASA values for residue X are calculated from Ala-X-Ala peptides in a stretched out configuration. The reference configurations are supplied for reference in the directory
rsa. Since these are not always the most exposed possible configuration, and because bond lengths and bond angles vary, the relative SASA values will sometimes be larger than 100 %. At the moment there is no interface to supply user-defined reference values.
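As a small worked illustration, the relative SASA is the absolute SASA of the residue divided by its reference value, expressed in percent. In the sketch below the function name is hypothetical, and the THR reference of 133.0 Å² is an assumed value, roughly what the RSA example later in this document implies (138.48 Å² reported as 104.1 %), not a number taken from FreeSASA's tables.

```python
def relative_sasa(absolute, reference):
    """Relative SASA in percent, or None when no reference value exists
    (the case that is printed as N/A in RSA output)."""
    if reference is None or reference <= 0:
        return None
    return 100.0 * absolute / reference

# Assumed reference value for THR; see the note above.
THR_REFERENCE = 133.0

print(round(relative_sasa(138.48, THR_REFERENCE), 1))  # exceeds 100 %
print(relative_sasa(165.16, None))                     # no reference available
```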
$ freesasa --format=xml --output-depth=residue 3wbm.pdb
generates the following
Where ellipsis indicates the remaining residues and chains.
$ freesasa --format=xml 3wbm.pdb
Generates the following
The command-line interface can also be used as a PDB filter:
$ cat 3wbm.pdb | freesasa --format=pdb
REMARK 999 This PDB file was generated by FreeSASA 2.0.
REMARK 999 In the ATOM records temperature factors have been
REMARK 999 replaced by the SASA of the atom, and the occupancy
REMARK 999 by the radius used in the calculation.
MODEL        1
ATOM      1  N   THR A   5     -19.727  29.259  13.573  1.64  9.44
ATOM      2  CA  THR A   5     -19.209  28.356  14.602  1.88  5.01
ATOM      3  C   THR A   5     -18.747  26.968  14.116  1.61  0.40
...
The output is a PDB file in which the temperature factors have been replaced by SASA values (last column) and the occupancy numbers by the radius of each atom (second to last column).
Only the atoms and models used in the calculation will be present in the output (see PDB input for how to modify this).
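A downstream script can read these two columns back. The following sketch assumes the standard fixed-column PDB layout (occupancy in columns 55-60, temperature factor in columns 61-66); the function name and the dictionary keys are hypothetical.

```python
SAMPLE_PDB = """\
REMARK 999 This PDB file was generated by FreeSASA 2.0.
MODEL        1
ATOM      1  N   THR A   5     -19.727  29.259  13.573  1.64  9.44
ATOM      2  CA  THR A   5     -19.209  28.356  14.602  1.88  5.01
ATOM      3  C   THR A   5     -18.747  26.968  14.116  1.61  0.40
"""

def read_radius_and_sasa(pdb_text):
    """Collect per-atom radius and SASA from FreeSASA's PDB output."""
    records = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        records.append({
            "name": line[12:16].strip(),
            "resn": line[17:20].strip(),
            "chain": line[21],
            "resi": line[22:26].strip(),
            "radius": float(line[54:60]),  # stored in the occupancy field
            "sasa": float(line[60:66]),    # stored in the B-factor field
        })
    return records

for atom in read_radius_and_sasa(SAMPLE_PDB):
    print(atom["name"], atom["radius"], atom["sasa"])
```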
Calculate the SASA of each residue type:
$ freesasa --format=res 3wbm.pdb
# Residue types in 3wbm.pdb
RES ALA :     251.57
RES ARG :    2868.98
RES ASN :    1218.87
...
RES A :    1581.57
RES C :    2967.12
RES G :    1955.16
RES U :    1693.68
Calculate the SASA of each residue in the sequence:
$ freesasa --format=seq 3wbm.pdb
# Residues in 3wbm.pdb
SEQ A    5 THR :  138.48
SEQ A    6 PRO :   25.53
SEQ A    7 THR :   99.42
...
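The per-residue listing is easy to post-process. The sketch below parses SEQ lines by splitting on whitespace, which assumes a non-blank chain identifier, and then sums the areas per residue name, giving totals of the same kind as those reported by --format=res. The function names are hypothetical.

```python
from collections import defaultdict

SAMPLE_SEQ = """\
# Residues in 3wbm.pdb
SEQ A    5 THR :  138.48
SEQ A    6 PRO :   25.53
SEQ A    7 THR :   99.42
"""

def parse_seq(text):
    """Return a list of (chain, residue number, residue name, SASA)."""
    residues = []
    for line in text.splitlines():
        if not line.startswith("SEQ"):
            continue  # skip the comment header and blank lines
        left, area = line.split(":")
        _, chain, number, name = left.split()
        residues.append((chain, number, name, float(area)))
    return residues

def sum_by_type(residues):
    """Add up the SASA of all residues that share a residue name."""
    totals = defaultdict(float)
    for _chain, _number, name, area in residues:
        totals[name] += area
    return dict(totals)

print(sum_by_type(parse_seq(SAMPLE_SEQ)))
```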
The CLI can also produce output similar to the RSA format from NACCESS. This format includes both absolute SASA values (ABS) and relative ones (REL) compared to a precalculated reference max value. The only significant difference between FreeSASA's RSA output format and that of NACCESS (except differences in areas due to different atomic radii), is that FreeSASA will print the value "N/A" where NACCESS prints "-99.9".
$ freesasa --format=rsa 3wbm.pdb
REM  FreeSASA 2.0
REM  Absolute and relative SASAs for 3wbm.pdb
REM  Atomic radii and reference values for relative SASA: ProtOr
REM  Chains: ABCDXY
REM  Algorithm: Lee & Richards
REM  Probe-radius: 1.40
REM  Slices: 20
REM RES _ NUM     All-atoms    Total-Side    Main-Chain     Non-polar     All polar
REM               ABS   REL     ABS   REL     ABS   REL     ABS   REL     ABS   REL
RES THR A   5  138.48 104.1   99.58 107.4   38.90  96.3   81.59  98.1   56.89 114.0
RES PRO A   6   25.53  19.3   11.31  11.0   14.23  47.7   21.67  18.7    3.86  23.9
...
RES GLY A  15    0.64   0.9    0.00   N/A    0.64   0.9    0.00   0.0    0.64   2.0
...
RES   U Y  23  165.16   N/A  165.16   N/A    0.00   N/A   52.01   N/A  113.15   N/A
RES   C Y  24  165.01   N/A  165.01   N/A    0.00   N/A   46.24   N/A  118.77   N/A
RES   C Y  25  262.46   N/A  262.46   N/A    0.00   N/A   85.93   N/A  176.52   N/A
END  Absolute sums over single chains surface
CHAIN  1 A     3785.5       3062.1        723.3       2051.6       1733.9
CHAIN  2 B     4342.3       3488.6        853.7       2385.2       1957.1
CHAIN  3 C     3961.1       3178.5        782.7       2122.4       1838.7
CHAIN  4 D     4904.3       3926.8        977.5       2572.0       2332.3
CHAIN  5 X     4156.5       4156.5          0.0       1236.9       2919.6
CHAIN  6 Y     4041.1       4041.1          0.0       1184.3       2856.8
END  Absolute sums over all chains
TOTAL         25190.8      21853.6       3337.2      11552.4      13638.4
Note that each
RES is a single residue, not a residue type as above (i.e. it has the same meaning as
SEQ above). This unfortunate confusion of labels is due to RSA support being added much later than the other options. Fixing it now would break the interface, so it will be addressed in the next major release at the earliest.
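When reading RSA output in a script, the "N/A" entries need special handling. The sketch below parses one RES line by splitting on whitespace and maps "N/A" to None. Whitespace splitting is an assumption that holds for the lines shown here; a fixed-width format can in principle have adjacent fields run together, which this sketch does not handle. The function and key names are hypothetical.

```python
COLUMNS = ("all_atoms", "total_side", "main_chain", "non_polar", "all_polar")

def _value(token):
    """Convert one RSA field to float, mapping N/A to None."""
    return None if token == "N/A" else float(token)

def parse_rsa_residue(line):
    """Parse a RES line into residue identity plus ABS/REL pairs."""
    tokens = line.split()
    if len(tokens) != 14 or tokens[0] != "RES":
        raise ValueError("not a well-separated RES line: %r" % line)
    name, chain, number = tokens[1:4]
    values = [_value(t) for t in tokens[4:]]
    result = {"name": name, "chain": chain, "number": number}
    for i, column in enumerate(COLUMNS):
        result[column] = {"abs": values[2 * i], "rel": values[2 * i + 1]}
    return result

gly = parse_rsa_residue(
    "RES GLY A  15    0.64   0.9    0.00   N/A    0.64   0.9"
    "    0.00   0.0    0.64   2.0"
)
print(gly["total_side"])
```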
The reference values for the NACCESS configuration in FreeSASA are not exactly the same as those that ship with NACCESS, but have been calculated from scratch using the tripeptides that ship with FreeSASA. Calling
$ freesasa 3wbm.pdb --format=rsa --radii=naccess
will give an RSA file where the ABS columns should be identical to NACCESS (if the latter is run with the flag
-b). REL values will differ slightly, due to the differences in reference values. NACCESS also gives different results for the nucleic acid main-chain and side-chain (possibly due to a bug in NACCESS?). FreeSASA defines the (deoxy)ribose and phosphate groups as main-chain and the base as side-chain.
--select can be used to define groups of atoms whose integrated SASA we are interested in. It uses a subset of the PyMOL
select command syntax; see Selection syntax for full documentation. The following example shows how to calculate the sum of exposed surface areas of all aromatic residues and of the four chains A, B, C and D (just the sum of the areas above).
$ freesasa --select "aromatic, resn phe+tyr+trp+his+pro" --select "abcd, chain A+B+C+D" 3wbm.pdb
...
SELECTIONS
freesasa: warning: Found no matches to resn 'TRP', typo?
freesasa: warning: Found no matches to resn 'HIS', typo?
aromatic :    1196.45
abcd :   16993.24
The lines shown above are appended to the regular output. This particular protein did not have any TRP or HIS residues, hence the warnings (written to stderr). The warnings can be suppressed with the flag -w.
Calculating the SASA of a given chain or group of chains separately from the rest of the structure can be useful for measuring how buried a chain is in a given structure. The option
--chain-groups can be used to do such a separate calculation, calling
$ freesasa --chain-groups=ABCD+XY 3wbm.pdb
produces the regular output for the structure 3WBM, but in addition it runs a separate calculation for the chains A, B, C and D as though X and Y aren't in the structure, and vice versa:
PARAMETERS
algorithm    : Lee & Richards
probe-radius : 1.400
threads      : 2
slices       : 20

####################

INPUT
source  : 3wbm.pdb
chains  : ABCDXY
atoms   : 3714

RESULTS (A^2)
Total   :   25190.77
Apolar  :   11552.38
Polar   :   13638.39
CHAIN A :    3785.49
CHAIN B :    4342.33
CHAIN C :    3961.12
CHAIN D :    4904.30
CHAIN X :    4156.46
CHAIN Y :    4041.08

####################

INPUT
source  : 3wbm.pdb
chains  : ABCD
atoms   : 2664

RESULTS (A^2)
Total   :   18202.78
Apolar  :    9799.46
Polar   :    8403.32
CHAIN A :    4243.12
CHAIN B :    4595.18
CHAIN C :    4427.11
CHAIN D :    4937.38

####################

INPUT
source  : 3wbm.pdb
chains  : XY
atoms   : 1050

RESULTS (A^2)
Total   :    9396.28
Apolar  :    2743.09
Polar   :    6653.19
CHAIN X :    4714.45
CHAIN Y :    4681.83
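The amount of surface buried in the protein/RNA interface follows from these numbers by simple subtraction. The sketch below copies the values from the output above; the variable names are illustrative only.

```python
# SASA of each chain in the full complex (first calculation).
in_complex = {"A": 3785.49, "B": 4342.33, "C": 3961.12, "D": 4904.30,
              "X": 4156.46, "Y": 4041.08}

# SASA of each chain when only its own group (ABCD or XY) is present.
in_group = {"A": 4243.12, "B": 4595.18, "C": 4427.11, "D": 4937.38,
            "X": 4714.45, "Y": 4681.83}

# Area of each chain that the other group covers in the complex.
buried = {chain: in_group[chain] - in_complex[chain] for chain in in_complex}

# Total area buried in the protein/RNA interface, from the group totals.
total_buried = 18202.78 + 9396.28 - 25190.77

for chain, area in buried.items():
    print(f"CHAIN {chain} : {area:8.2f}")
print(f"Total   : {total_buried:8.2f}")
```

Chain D, for example, loses only about 33 Å² on complex formation, while chain Y loses about 641 Å².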
The user can ask to include hydrogen atoms and HETATM entries in the calculation using the options
--hydrogen and
--hetatm. In both cases adding unknown atoms will emit a warning for each atom. This can be remedied either by using the flag
-w to suppress warnings, or by using a custom classifier so that they are recognized (see Classifier configuration files).
By default FreeSASA guesses the element of an unknown atom and uses that element's VdW radius. If this fails, the radius is set to 0 (and hence the atom will not contribute to the calculated area). Users can request to either skip unknown atoms completely (i.e. no guessing) or to halt when unknown atoms are found and exit with an error. This is done with the option
--unknown which takes one of the three arguments
skip,
halt or
guess (default). Whenever an unknown atom is skipped or its radius is guessed, a warning is printed to stderr.
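The three policies can be pictured with the following sketch. It is an illustration of the behavior described above, not FreeSASA's code: the function names, the first-letter element guess and the small radius table (Bondi-style VdW values) are all assumptions of this example.

```python
import warnings

# Illustrative VdW radii in Angstrom; not FreeSASA's own table.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "P": 1.80, "S": 1.80}

class UnknownAtomError(Exception):
    """Raised under the 'halt' policy."""

def guess_element(atom_name):
    """Hypothetical heuristic: first letter of the atom name.
    This misreads two-letter elements such as FE."""
    letters = "".join(ch for ch in atom_name if ch.isalpha())
    return letters[:1].upper() or None

def radius_for_unknown(atom_name, policy="guess"):
    """Return a radius, or None if the atom should be left out."""
    if policy == "halt":
        raise UnknownAtomError("unknown atom %r" % atom_name)
    if policy == "skip":
        warnings.warn("skipping unknown atom %r" % atom_name)
        return None
    if policy == "guess":
        element = guess_element(atom_name)
        radius = VDW_RADII.get(element, 0.0)  # 0 if the guess fails
        warnings.warn("guessed radius %.2f for %r" % (radius, atom_name))
        return radius
    raise ValueError("policy must be 'guess', 'skip' or 'halt'")

print(radius_for_unknown("CX1"))
```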
If a PDB file has several chains and/or models, by default all chains of the first model are used, and the rest of the file is ignored. This behavior can be modified using the following options
--join-models: Joins all models in the input into one large structure. Useful for biological assembly files where different locations of the same chain in the oligomer are represented by different MODEL entries.
--separate-models: Calculate SASA separately for each model in the input. Useful when the same file contains several conformations of the same molecule.
--separate-chains: Calculate SASA separately for each chain in the input. Can be combined with
--separate-models to calculate SASA of each chain in each model.
--chain-groups: see Analyzing groups of chains