Language models can improve physics measurements with improved tau reconstruction

To find rare processes in collider data, scientists use computer algorithms to determine the type and properties of particles based on the faint signals that they leave in the detector. One such particle is the tau lepton, which is produced, for example, in the decay of the Higgs boson.
The tau lepton leaves a spray, or jet, of low-energy particles whose subtle pattern allows it to be distinguished from jets produced by other particles. The jet also carries information about the energy of the tau lepton, which is distributed among the daughter particles, and about the way it decayed. Currently, the best algorithms use multiple steps of combinatorics and computer vision.
Transformers, the language-model architecture behind ChatGPT, have shown much stronger performance in rejecting backgrounds than computer-vision-based methods. In this paper, researchers showed that such language-based models can identify tau leptons from their jet patterns and also determine their energy and decay properties more accurately than before.
This can be done by treating the jet of particles as a sentence, where each word corresponds to a particle, and finding the relations between the particles using the transformer algorithm. Such approaches are promising because they could significantly improve the signal-to-background ratio in future analyses involving the tau lepton, such as the search for double-Higgs production.
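As a rough illustration of the "jet as a sentence" idea, the sketch below applies a single self-attention step to a toy jet in which each particle plays the role of a word. It is not the model from the paper: the particle features (transverse momentum, pseudorapidity, azimuthal angle, charge), the embedding size, the random weights and the function names are all assumptions chosen for illustration, and a real model would learn its weights from simulated collisions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over particle tokens."""
    q = tokens @ w_q                      # queries, one per particle
    k = tokens @ w_k                      # keys, one per particle
    v = tokens @ w_v                      # values, one per particle
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)    # how strongly each particle attends to the others
    return weights @ v, weights


rng = np.random.default_rng(0)

# Toy jet (hypothetical values): one row per particle,
# columns are (pt [GeV], eta, phi, charge).
jet = np.array([
    [25.0, 0.10, 0.20, 1.0],
    [12.0, 0.12, 0.18, -1.0],
    [8.0, 0.09, 0.25, 1.0],
    [3.0, 0.15, 0.22, 0.0],
    [1.5, 0.05, 0.30, 0.0],
])

d_model = 8  # assumed embedding size
w_embed = rng.normal(size=(jet.shape[1], d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

tokens = jet @ w_embed                    # each particle becomes a "word" vector
attended, weights = self_attention(tokens, w_q, w_k, w_v)

# Averaging over particles gives one vector per jet, which a trained model
# could pass on to classification or energy-regression layers.
jet_summary = attended.mean(axis=0)
```

Because no positional information is added, the averaged output does not depend on the order in which the particles are listed, which suits a jet, where the particles have no natural ordering.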
The work is in the journal Computer Physics Communications.
More information: Laurits Tani et al, A unified machine learning approach for reconstructing hadronically decaying tau leptons, Computer Physics Communications (2024).
Provided by Estonian Research Council