Getting Started with pyscal#

pyscal is a Python library for computing structural descriptors of atomistic systems. It uses C++ under the hood for performance, and works directly with ASE (Atomic Simulation Environment) Atoms objects.

Key Concepts#

  1. ASE Atoms — All structures are standard ASE Atoms objects. You can create them with ASE, read them from files, or use pyscal’s built-in structure creators.

  2. Functional API — All pyscal functions take an Atoms object as the first argument and return results (often also stored back on the Atoms for later use).

  3. Neighbors first — Most descriptors require neighbor information. Call pyscal.find_neighbors() before computing descriptors.

Installation#

pip install pyscal3

This installs pyscal along with its dependencies: numpy, ase, and pyyaml.

import pyscal
print(f"pyscal version: {pyscal.__version__}")
pyscal version: 4.0.0

Your First Calculation#

Let’s compute Steinhardt bond-order parameters for an FCC copper structure.

We’ll use ASE’s bulk() to create the structure, then pyscal’s functional API to compute descriptors.

from ase.build import bulk
import numpy as np

# Create an FCC copper supercell (4x4x4 conventional cells = 256 atoms)
atoms = bulk("Cu", "fcc", cubic=True).repeat(4)
print(f"Structure: {len(atoms)} atoms")
print(f"Cell dimensions: {atoms.cell.lengths()}")
Structure: 256 atoms
Cell dimensions: [14.44 14.44 14.44]
# Step 1: Find neighbors (required before computing descriptors)
pyscal.find_neighbors(atoms, method="cutoff", cutoff=0)  # cutoff=0 -> adaptive

# Step 2: Compute Steinhardt parameters q4 and q6
q = pyscal.steinhardt_parameter(atoms, l=[4, 6])

print(f"q4: mean = {q[0].mean():.4f}, std = {q[0].std():.6f}")
print(f"q6: mean = {q[1].mean():.4f}, std = {q[1].std():.6f}")
q4: mean = 0.1909, std = 0.000000
q6: mean = 0.5745, std = 0.000000

For a perfect FCC crystal, you’ll see characteristic values of \(q_4 \approx 0.19\) and \(q_6 \approx 0.57\).

How Results Are Stored#

pyscal functions return results directly and store them on the ASE Atoms object for later use:

  • Per-atom arrays: stored in atoms.arrays["pyscal_<name>"] (fixed-size NumPy arrays)

  • Global / ragged data: stored in atoms.info["pyscal_<name>"] (lists, scalars, etc.)

# Results stored on the atoms object
print("Per-atom arrays (atoms.arrays):")
for key in sorted(atoms.arrays):
    if key.startswith("pyscal_"):
        print(f"  {key}: shape {atoms.arrays[key].shape}")

print("\nGlobal / ragged data (atoms.info):")
for key in sorted(atoms.info):
    if key.startswith("pyscal_"):
        val = atoms.info[key]
        if isinstance(val, list):
            print(f"  {key}: list of {len(val)} items")
        else:
            print(f"  {key}: {val}")
Per-atom arrays (atoms.arrays):
  pyscal_cutoff: shape (256,)
  pyscal_diff: shape (256, 12, 3)
  pyscal_neighbordist: shape (256, 12)
  pyscal_neighbors: shape (256, 12)
  pyscal_neighborweight: shape (256, 12)
  pyscal_phi: shape (256, 12)
  pyscal_q4: shape (256,)
  pyscal_q4_imag: shape (256, 9)
  pyscal_q4_real: shape (256, 9)
  pyscal_q6: shape (256,)
  pyscal_q6_imag: shape (256, 13)
  pyscal_q6_real: shape (256, 13)
  pyscal_r: shape (256, 12)
  pyscal_temp_neighbordist: shape (256, 42)
  pyscal_temp_neighbors: shape (256, 42)
  pyscal_theta: shape (256, 12)

Global / ragged data (atoms.info):
  pyscal_neighbor_method: cutoff
  pyscal_neighbors_found: True

Reading Structures from Files#

pyscal works with any structure that ASE can read — LAMMPS dumps, POSCAR, CIF, XYZ, and many more.

from ase.io import read

# Read a LAMMPS dump file
atoms_file = read("conf.fcc.dump", format="lammps-dump-text")
print(f"Loaded {len(atoms_file)} atoms from LAMMPS dump")

# Compute a descriptor directly
cna = pyscal.common_neighbor_analysis(atoms_file)
print(f"CNA: {cna}")
Loaded 1008 atoms from LAMMPS dump
CNA: {'others': 439, 'fcc': 564, 'hcp': 0, 'bcc': 5, 'ico': 0}

Using pyscal’s Structure Creators#

pyscal includes its own structure creation utilities for common crystal types and elements.

from pyscal.structures import make_crystal, make_element, available_structures, available_elements

print(f"Available structures: {available_structures()}")
print(f"Available elements (first 10): {available_elements()[:10]}...")
Available structures: ['simple_cubic', 'bcc', 'fcc', 'hcp', 'dhcp', 'diamond', 'a15', 'l12', 'b2']
Available elements (first 10): ['Ac', 'Ag', 'Al', 'Ar', 'Au', 'Ba', 'Be', 'Ca', 'Cd', 'Ce']...
# Create a BCC structure with a given lattice constant
bcc = make_crystal("bcc", lattice_constant=2.87, repetitions=(4, 4, 4))
print(f"BCC: {len(bcc)} atoms")

# Or create by element name (lattice constant looked up automatically)
fe = make_element("Fe", repetitions=(4, 4, 4))
print(f"Fe: {len(fe)} atoms")

cna = pyscal.common_neighbor_analysis(fe)
print(f"CNA: {cna}")
BCC: 128 atoms
Fe: 128 atoms
CNA: {'others': 0, 'fcc': 0, 'hcp': 0, 'bcc': 128, 'ico': 0}

The Typical Workflow#

Most pyscal analyses follow this pattern:

import pyscal
from ase.build import bulk

# 1. Get a structure (ASE Atoms)
atoms = bulk("Cu", "fcc", cubic=True).repeat(4)

# 2. Find neighbors
pyscal.find_neighbors(atoms, method="cutoff", cutoff=0)

# 3. Compute descriptors
q = pyscal.steinhardt_parameter(atoms, l=[4, 6])

# 4. Access results
q6_values = atoms.arrays["pyscal_q6"]  # per-atom NumPy array

Some functions (like common_neighbor_analysis and centrosymmetry) handle neighbor finding internally, so step 2 can sometimes be skipped.

API Reference#

Function

Description

find_neighbors()

Find atomic neighbors (cutoff, adaptive, Voronoi, by number)

get_distance()

PBC-aware distance between two positions

steinhardt_parameter()

Bond-order parameters \(q_l\)

disorder()

Disorder parameter based on Steinhardt

common_neighbor_analysis()

CNA / adaptive CNA

diamond_structure()

Diamond structure identification

centrosymmetry()

Centrosymmetry parameter

voronoi_vector()

Voronoi structure fingerprint \((n_3, n_4, n_5, n_6)\)

entropy()

Entropy parameter for solid/liquid distinction

short_range_order()

Warren-Cowley SRO parameter

radial_distribution_function()

Radial distribution function \(g(r)\)

angular_criteria()

Angular parameter for diamond detection

chi_params()

Chi-parameter vector

find_solids()

Solid/liquid classification

find_clusters()

Cluster analysis

average_over_neighbors()

Average any property over neighbors

Trajectory

Lazy trajectory reader for LAMMPS dumps