When we start to think about method development, many things come to mind. It’s a good idea to clearly define the goals of a project so that we can plan efforts and resources to meet them. So, what should our expectations, or goals, be when we use NIR for qualitative methods, particularly raw material identification? First and foremost, we want to develop a method that will pass “good” materials. Here, “good” means that the material is properly labeled and meets the quality standard of the samples used in calibration. More importantly, though, we want our models to fail “bad” materials. Here, “bad” means a material that has been mislabeled or contaminated. We would also like the method to be fast and simple, enabling inspection of every incoming container of raw materials. These goals are well within reach using NIR.
False positives and false negatives both mean the method has missed the mark; the following tips and tricks should help you avoid some common pitfalls that lead to a lack of selectivity and/or robustness.
Let’s now consider the main facets of sample planning.
You use your NIR to collect a spectrum, and you see a graph with peaks and valleys. You should go into a project with a basic understanding of which variables are going to have the greatest influence on the shape (that is, the peak locations and intensities) of that spectrum. Failure to account for (or control) some of those factors may result in a lack of selectivity or robustness. Chemical and physical properties of the sample, and the sampling technique itself, are going to be the biggest contributors to the spectra that you collect.
Chemical properties refer to the actual chemical species, the molecules, present in the sample to absorb NIR light. Molecules are of course made up of atoms, and NIR is very sensitive to bonding between specific atomic pairs (i.e. functional groups), such as carbon and hydrogen, hydrogen and oxygen, or nitrogen and hydrogen. These bonds produce peaks at reproducible wavenumbers. However, the raw materials we receive are not always single-component systems. There may be specific excipients, like flow aids, or perhaps residual solvent left over from a drying or granulation process. In these multi-component systems, each individual component of the mixture will contribute specific absorbance peaks, and the proportion of each component will be reflected in peak intensity.
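To make the multi-component idea concrete, here is a minimal sketch in Python with NumPy. It assumes simple linear mixing (each component contributes in proportion to its fraction), which is only an approximation for real diffuse-reflectance measurements. The band positions, the function names and the moisture fractions are all hypothetical and chosen for illustration only.

```python
import numpy as np

# Hypothetical wavenumber axis in cm^-1 (10 cm^-1 steps).
wavenumber = np.linspace(4000.0, 10000.0, 601)

def gaussian_band(center, height, width=60.0):
    """One idealized absorption band (illustrative shape, not real data)."""
    return height * np.exp(-0.5 * ((wavenumber - center) / width) ** 2)

# Hypothetical pure-component spectra with well-separated bands.
bulk_material = gaussian_band(5900.0, 1.0)
moisture = gaussian_band(5200.0, 1.0)

def mixture_spectrum(fraction_moisture):
    """Assumed linear mixing of the two pure-component spectra."""
    return (1.0 - fraction_moisture) * bulk_material + fraction_moisture * moisture

dry = mixture_spectrum(0.02)
damp = mixture_spectrum(0.08)

# Compare the intensity at the moisture band for the two samples.
idx = np.argmin(np.abs(wavenumber - 5200.0))
ratio = damp[idx] / dry[idx]
print(f"moisture band intensity ratio (damp / dry): {ratio:.2f}")
```

In this toy model the band stays at the same wavenumber and only its height follows the moisture fraction, which is the behavior described above.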
Next, let’s consider the physical properties of the sample. While physical properties won’t change where (i.e. at which wavenumbers) a sample absorbs, they may change the apparent intensity of absorption through their effect on light scattering throughout the sample. Scattering is typically wavenumber dependent and results in changes to the overall slope of the absorbance curve. The sample’s state of matter, whether solid, semi-solid, or liquid, will affect the spectral shape due to varying degrees of vibrational freedom of the molecular bonds. Particle size and density both affect how efficiently the sample scatters NIR light. Sample grades may be defined by unique particle size ranges, and highly compressible powders show the greatest variation in density. These factors will show up as baseline offsets and/or differences in the overall slope of the spectrum. Temperature will have the greatest impact on liquid systems, but extreme temperature variation can affect spectra of solid-state samples as well.
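The following sketch illustrates what "baseline offset and slope" means in practice. It assumes a deliberately simple scatter model (a constant offset plus a linear tilt added to one idealized band); real scattering is more complex, and every number and variable name here is hypothetical.

```python
import numpy as np

# Hypothetical wavenumber axis (cm^-1) and a single idealized absorption band.
wavenumber = np.linspace(4000.0, 10000.0, 601)
x = wavenumber - wavenumber[0]
band = 0.8 * np.exp(-0.5 * ((wavenumber - 5200.0) / 60.0) ** 2)

# Assumed scatter model: constant offset plus a linear tilt across the axis.
offset, slope = 0.15, 2.0e-5
coarse_powder = band + offset + slope * x

# The difference between the two spectra is a straight line: no new peaks appear.
difference = coarse_powder - band
fit_slope, fit_offset = np.polyfit(x, difference, 1)
residual = difference - (fit_offset + fit_slope * x)

print(f"fitted offset: {fit_offset:.3f}, fitted slope: {fit_slope:.2e}")
print(f"largest deviation from a straight line: {np.abs(residual).max():.1e}")
```

Under this assumption the chemistry (the band) is untouched; only the baseline moves, which is why such effects can be handled in the calibration rather than mistaken for a different material.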
The last leg of sample planning is the sampling technique itself. Factors such as instrument parameters (e.g. the measurement cell type and spectral averaging), sample uniformity and sample container type should be kept as consistent as possible to avoid introducing new sources of variation unrelated to your raw material identity. Ideally, your sample container should be kept consistent in that if you’re using 8-mm vials for calibration then you should use 8-mm vials for your routine application. However, because many of the sample containers, especially plastic sleeves or plastic cups, Petri dishes and scintillation vials are not 100% transparent in the NIR region, it is also important to capture the expected variation of those containers within the calibration model. That means, if you’re collecting spectra through plastic bags, collect spectra in bags from different lots from your supplier. That way, if there are small variations in the bags (e.g. thickness or polymer content), you aren’t training your NIR method to see those changes as being related to your raw material.
Our typical rule of thumb for qualitative method development is to use a minimum of 5 lots of each material in your NIR projects. As an example, you receive 5 large sample containers (sacks) of a raw material from Vendor A, where each sack is a unique lot. Maybe you have these in-house, or maybe you have to wait a few production cycles until you can acquire the recommended 5 lots. In any case, we suggest that you pull 3 samples from each sack to be measured by NIR. Pulling samples from the top, middle and bottom of the sack may provide the greatest amount of spectral variation in things like absorbed moisture and particle size to train your model on. This will pay off in terms of method robustness down the road.
As far as general recommendations for sampling:
- Each raw material should be represented in the calibration by multiple lots; ideally:
  - (Minimum of) 5 lots of each raw material
  - (Suggested) 3 samples from each lot, stratified sampling, different operators
  - (Good practice) 3 NIR measurements of each sample
    - Samples should be shaken / tapped / agitated between scans
    - Important when lots are limited, data are scarce
- ONLY those materials whose quality has been validated by primary analysis should be used in the calibration work!
  - Otherwise – risk adding “bad samples” to your identification method
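The recommendations above translate into a simple scan count per material. The sketch below enumerates a plan under those defaults (5 lots, 3 sampling positions, 3 scans); the function name, the dictionary layout and the material name "lactose" are hypothetical.

```python
from itertools import product

def sampling_plan(material, n_lots=5, positions=("top", "middle", "bottom"), n_scans=3):
    """Enumerate every planned NIR scan for one raw material."""
    plan = []
    for lot, position, scan in product(range(1, n_lots + 1), positions, range(1, n_scans + 1)):
        plan.append({
            "material": material,
            "lot": f"lot-{lot}",
            "position": position,  # stratified sampling within the sack
            "scan": scan,          # re-shake / tap the sample between scans
        })
    return plan

plan = sampling_plan("lactose")
print(len(plan))  # 5 lots x 3 samples x 3 scans = 45 spectra
```

A list like this can double as a checklist in the lab, so that no lot or sampling position is skipped.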
Calibration project workflow
A typical and recommended calibration project workflow is shown here.
The first step is always going to be to define your calibration groups. Each group has its own NIRCal Project. You’ll want to define groups to maximize NIR selectivity among members of the group. We’ll come back to this idea soon.
The second step is to define your calibration method within the project. For 99% of qualitative projects, the Cluster method in our BUCHI NIRCal Software is the right choice. This method is ideal for finding the right calibration parameters to discriminate between all of the calibration properties simultaneously. However, if you’re looking at doing something like blend uniformity, the SIMCA method is sufficient. Take note that principal component analysis (PCA) is the basis for both cluster analysis and SIMCA.
The third step is to define sets. As a rule of thumb, 2/3 of the spectra of each property in the project should be assigned to the calibration set, while the remaining 1/3 should be in validation. We’ll take a closer look at set assignments shortly.
The fourth step is property selection. You’ll want to double check that all of your project properties are of the type “identification” and that they are all assigned to the calibration set.
The first four steps are grouped into what I would call the calibration project “set-up.” These steps all need to be completed before any calibration model work is started, and these things will be mostly fixed until calibration update is needed.
Once the calibration project set-up is complete, it’s time to calculate a model that will adequately differentiate these materials. I highly recommend starting with the NIRCal Calibration Wizard, as it will go through many combinations of pre-processing (e.g. derivatives, normalization) and wavelength selection much more quickly than anyone manning a keyboard and mouse. Once a calibration model has been calculated that produces adequate results, that is, a project where all materials are correctly identified, you may consider using the calibration in a routine application.
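For readers who have not worked with pre-processing before, here is a minimal NumPy sketch of two generic treatments of the kind mentioned above: a normalization (standard normal variate, SNV) and a derivative (a plain second finite difference). This is not the NIRCal implementation; the function names and the simulated spectra are assumptions made for illustration.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.atleast_2d(spectra)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

def second_difference(spectra):
    """Crude second derivative by finite differences along the wavenumber axis."""
    return np.diff(np.atleast_2d(spectra), n=2, axis=1)

# Hypothetical spectrum: one idealized band on an evenly spaced axis.
wavenumber = np.linspace(4000.0, 10000.0, 601)
band = np.exp(-0.5 * ((wavenumber - 5200.0) / 60.0) ** 2)

scaled = 1.3 * band + 0.2                              # offset plus overall scaling
tilted = band + 0.2 + 2.0e-5 * (wavenumber - 4000.0)   # offset plus linear slope

snv_removes_scaling = np.allclose(snv(scaled), snv(band))
derivative_removes_tilt = np.allclose(second_difference(tilted), second_difference(band))
print(snv_removes_scaling, derivative_removes_tilt)
```

SNV cancels a constant offset and an overall scaling, while the second difference cancels an offset and a linear slope. Which combination works best for a given set of materials is an empirical question, which is why an automated search is so useful.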
There may be a need to add additional calibration samples to your project. One instance is where you actually “fail” an ID test, then submit that sample for analysis by a primary method. If that raw material passes the primary method, then there was some source of spectral variation that has not yet been seen by the calibration project. You’ll want to loop back to Step 3, adding the new and validated spectra to your calibration set, then recalculate the model and adjust your routine application.
If you decide to add a new property to a project, you’ll want to loop back to Step 1.
Let’s come back to the idea of grouping. Suppose you have two materials, and hence two properties, in a project: Material A and Material B. Your NIR calibration is going to try to maximize the differences between these two materials. However, if we take a closer look at Material A or Material B, we see that the samples aren’t all exactly the same (in the example below, you see slight color variations). The within-property differences may be something like variations in the composition of flow aids or residual moisture from a humid summer day, or they might be acceptable differences in particle size distribution. In any case, the model now not only has to maximize the differences between Materials A and B, but also minimize the differences between all of the samples assigned to property A (or B, respectively). Then of course, if we have a new material which is intermediate in its chemical characteristics to Materials A and B (i.e. Material C), the model needs to fail that material when it is tested as either Material A or Material B.
In order to have the best discrimination between chemically similar compounds, they should be grouped in the same project. This is the best way to avoid false positives and the most efficient way to develop calibrations. Here are some example groupings to consider.
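The A / B / C scenario can be sketched with a generic PCA-based cluster check. To be clear, this is not the NIRCal Cluster algorithm: the simulated spectra, the two-component PCA, the spherical clusters and the radius rule (3 times the largest calibration distance) are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
wavenumber = np.linspace(4000.0, 10000.0, 301)

def simulate(center, n):
    """Hypothetical spectra: one Gaussian band plus small random noise."""
    band = np.exp(-0.5 * ((wavenumber - center) / 80.0) ** 2)
    return band + 0.01 * rng.standard_normal((n, wavenumber.size))

# Calibration spectra for two properties, Material A and Material B.
spectra = np.vstack([simulate(5200.0, 10), simulate(5900.0, 10)])
labels = np.array(["A"] * 10 + ["B"] * 10)

# PCA via SVD of the mean-centered calibration spectra.
mean = spectra.mean(axis=0)
_, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
loadings = vt[:2]                        # keep two principal components
scores = (spectra - mean) @ loadings.T

# One cluster per property: a center and a radius in score space.
clusters = {}
for name in ("A", "B"):
    points = scores[labels == name]
    center = points.mean(axis=0)
    radius = 3.0 * np.linalg.norm(points - center, axis=1).max()  # assumed tolerance
    clusters[name] = (center, radius)

def identify(spectrum):
    """Return the properties whose cluster contains the projected spectrum."""
    score = (spectrum - mean) @ loadings.T
    return [name for name, (center, radius) in clusters.items()
            if np.linalg.norm(score - center) <= radius]

result_a = identify(simulate(5200.0, 1)[0])   # a new sample of Material A
result_c = identify(simulate(5550.0, 1)[0])   # an intermediate "Material C"
print(result_a, result_c)
```

The new Material A sample lands inside the A cluster, while the intermediate material falls outside both clusters and is rejected. Shrinking the radius tightens the tolerance; widening it risks letting the intermediate material through.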
- Amino acids
- Organic salts
- Animal by-products
Let’s consider calibration and validation set selection for lots of raw materials we have in the warehouse. In the image below, I’ve made calibration and validation sample assignments indicated by the blue and yellow circles, respectively.
Note the approximate 2/3 to 1/3 set assignments and also note that each sack, which represents a single lot, is designated as either calibration or validation. It is very important for robustness to avoid redundancy between calibration and validation sets.
If lots from a second (or third or fourth) vendor are available, I would start by adding the first available lot from each new vendor to calibration, as the expectation is that those lots would differ in some way from Vendor A’s lots. Then, the remainder of the lots could be randomized to either calibration or validation.
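The set-assignment rules above can be sketched as a small helper: whole lots go to either calibration or validation, the first lot of every vendor goes to calibration, and the rest are randomized toward a roughly 2/3 to 1/3 split. The function name, the lot identifiers and the fixed random seed are hypothetical.

```python
import random

def split_by_lot(lots_by_vendor, calibration_fraction=2 / 3, seed=0):
    """Assign whole lots to calibration or validation (never split a lot)."""
    rng = random.Random(seed)
    n_total = sum(len(lots) for lots in lots_by_vendor.values())
    n_calibration = round(calibration_fraction * n_total)

    # The first available lot of every vendor always goes to calibration.
    calibration = [lots[0] for lots in lots_by_vendor.values() if lots]

    # The remaining lots are shuffled and used to fill up the calibration set.
    remaining = [lot for lots in lots_by_vendor.values() for lot in lots[1:]]
    rng.shuffle(remaining)
    still_needed = max(n_calibration - len(calibration), 0)
    calibration += remaining[:still_needed]
    validation = remaining[still_needed:]
    return {"calibration": calibration, "validation": validation}

lots = {
    "Vendor A": ["A-1", "A-2", "A-3", "A-4", "A-5"],
    "Vendor B": ["B-1", "B-2", "B-3"],
}
sets = split_by_lot(lots)
print(sets)
```

Every spectrum then inherits the assignment of its lot, so all samples and scans from one sack end up in the same set and no lot appears in both calibration and validation.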
What if a routine measurement fails?
- Check that the sample (or probe) is positioned properly during the measurement
- Check that the optical path (window, sample container) is clean and retry the measurement
- Check that a good reference was collected and that the reference material is clean
- The sample may have new variation that wasn’t included in the calibration training set (e.g. higher moisture content due to seasonal humidity, a new vendor with a different particle size or flow aid)
  - Send this sample for primary analysis; if it passes, add it to the calibration data set and recalculate the calibration model
FAQ: What if a false positive ID happens?
- Check if sample was mislabeled
- Consider re-doing the calibration that includes the sample property
- If the false positive was for a sample in a different calibration group (e.g. during cross-group validation), combine those samples in the same group (i.e. NIRCal Project) and recalibrate
- If the false positive was from the same calibration group:
  - Try new pre-processing
  - Try tighter tolerances (e.g. reduce Radii or residual blow-up limit)
  - Try removing a dissimilar substance in the calibration group to increase selectivity to more chemically similar substances
- Send sample for primary analysis
BUCHI offers on-site, in-house and remote training opportunities to dig deeper into these concepts and work hand-in-hand with experts in calibration software. Contact us at firstname.lastname@example.org to set up an appointment.