Building (Better) Jet Algorithms for Run II – Workshop Results:


   For the full report see -


From G.C. Blazey, J.R. Dittmann, S.D. Ellis, V.D. Elvira, K. Frame, S. Grinstein, R. Hirosky, P. Piegaia, H. Schellman, R. Snihur, V. Sorin, D. Zeppenfeld


Here is my brief synopsis: 


General Point: Many issues derive from the fact that experimental priorities/needs do not always match theoretical priorities/needs!


“History” starts 10 years ago with the “Snowmass Accord” (or the Snowmass Algorithm).  Idea was to have an agreed upon algorithm (hence accord) that everyone would use.  But, in practice, it was flawed –

·       Was not efficient – experimenters used seeds to limit where one looked for jets – this introduces IR sensitivity at NNLO

·       Did not treat issue of overlapping cones – split/merge question


Seeded cones can miss configurations in which two widely separated energy deposits still fit in one cone but have little energy in between – theory introduced the parameter RSEP to mimic this effect


Note: there are two logically distinct phases to jet finding:

1.    Identify contents of jets - partons, particles, calorimeter towers – jet algorithm

2.    Add kinematic properties of jet contents (e.g., 4-vectors) to find jet kinematic properties – recombination scheme
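Phase 2 can be illustrated with a minimal sketch of E-scheme recombination (the scheme recommended later in this synopsis), in which the jet 4-vector is simply the sum of the constituent 4-vectors. The function name `e_scheme` and the `(E, px, py, pz)` tuple layout are assumptions made for illustration, not part of the workshop code.

```python
import math

def e_scheme(constituents):
    """Sum constituent 4-vectors (E, px, py, pz) and derive jet kinematics.

    Hypothetical helper: the tuple layout is an assumption for this sketch.
    """
    E = sum(c[0] for c in constituents)
    px = sum(c[1] for c in constituents)
    py = sum(c[2] for c in constituents)
    pz = sum(c[3] for c in constituents)
    pt = math.hypot(px, py)                    # transverse momentum
    y = 0.5 * math.log((E + pz) / (E - pz))    # rapidity (boost-additive)
    phi = math.atan2(py, px)                   # azimuth in (-pi, pi]
    m2 = E * E - px * px - py * py - pz * pz   # jets acquire mass in this scheme
    return {"E": E, "pt": pt, "y": y, "phi": phi, "mass": math.sqrt(max(m2, 0.0))}

# Two massless constituents at right angles in the transverse plane.
jet = e_scheme([(10.0, 10.0, 0.0, 0.0), (10.0, 0.0, 10.0, 0.0)])
```

Note that the jet content (phase 1) is an input here: the same recombination can be applied to the output of any cone or KT algorithm.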



“Big Picture” goal:  do 1% Jet Physics – precision QCD – in Run II,


·       for its own sake


·       to isolate physics beyond the Standard Model



Requires control of the Jet Energy to better than 1% (rates fall rapidly with the jet energy)

Requires understanding of jet energy calibration to < 1%


Requires understanding of corrections “back to” fixed order perturbation theory to < 1%


Requires “full disclosure” of all jet algorithm details

Strongly recommends that both collaborations use the same (or very similar) jet algorithm
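The 1% requirements above follow from the steepness of the jet spectrum. As a rough illustration only, assume a power-law spectrum with exponent n and a fractional energy-scale error ε (both are assumptions for this estimate, not numbers from the report):

```latex
\frac{d\sigma}{dE_T} = A\,E_T^{-n}, \qquad E_T' = (1+\epsilon)\,E_T
\;\Rightarrow\;
\frac{d\sigma}{dE_T'} = A\,(1+\epsilon)^{\,n-1}\,E_T'^{-n}
\approx \bigl[\,1 + (n-1)\,\epsilon\,\bigr]\,A\,E_T'^{-n}
```

For a hypothetical n = 6, a 1% error in the jet energy scale shifts the measured rate at fixed E_T by about 5%, so the energy scale must be controlled better than the target precision on the rate.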

History of HIDDEN issues, all of which influence the result


·       Energy Cut on towers kept in analysis (e.g., to avoid noise)

·       (Pre)Clustering to find seeds


-         Energy Cut on precluster towers

-         Energy cut on clusters

-         Energy cut on seeds kept


·       Starting with seeds find stable cones by iteration

-         In JETCLU, “once in a cone, always in a cone”, the “ratchet” or “Velcro” effect


·       Overlapping stable cones must be split/merged


-         Depends on overlap parameter fmerge

-         Order of operations matters

All of these issues impact the content of the “found” jet


-         Shape may not be a cone

-         Number of towers can differ, i.e., different energy

-         Corrections for underlying event must be tower by tower



Can be more important than the differences between algorithms, e.g., seeded vs seedless



Need to learn from the past!

Goals of IDEAL ALGORITHM   (motherhood)

Fully Specified:  including defining in detail any preclustering, merging, and splitting issues

Theoretically Well Behaved:  the algorithm should be infrared and collinear safe (and insensitive) with no ad hoc clustering parameters (e.g., RSEP)

Detector Independence:  there should be no dependence on cell type, numbers, or size

Order Independence: The algorithms should behave equally at the parton, particle, and detector levels.

Uniformity: everyone uses the same algorithms (to the best possible approximation) – Recommendation: 1 legacy cone, 1 seedless cone, 1 KT algorithm – with E-scheme momentum recombination


Theory –

·       Boost invariant results – use variables with appropriate boost properties

·       Kinematic boundary stability – use variables with appropriate energy conservation to allow resummation calculations

⇒      Use E-scheme variables

Experiment –

·       Minimize resolution smearing and angle biases

·       Stability with luminosity – not sensitive to multiple collisions

·       Efficient use of computer resources – but do not let this drive problems with physics issues (e.g., seeds and preclustering)

·       Easy to calibrate – not so worried about the size of corrections as about the accuracy of corrections

Cone Algorithm – particles, calorimeter towers, partons in cone of size R, defined in angular space, e.g., Snowmass $(\eta, \phi)$

CONE center - $(\eta_C, \phi_C)$

CONE  i ⊂ C iff   $\sqrt{(\eta_i - \eta_C)^2 + (\phi_i - \phi_C)^2} \le R$



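Cone centroid ($E_T$-weighted):  $\bar\eta_C = \sum_{i \subset C} E_{T,i}\,\eta_i / E_{T,C}$,  $\bar\phi_C = \sum_{i \subset C} E_{T,i}\,\phi_i / E_{T,C}$,  with $E_{T,C} = \sum_{i \subset C} E_{T,i}$

“Flow vector”  $\vec F_C = (\bar\eta_C - \eta_C,\ \bar\phi_C - \phi_C)$


Jet is defined by “stable” cone:  $\vec F_C = 0$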



Stable cones found by iteration:  start with cone anywhere (and, in principle, everywhere), calculate the centroid of this cone, put new cone at centroid, iterate until cone stops “flowing”, i.e., stable ⇒ Proto-jets (prior to split/merge)
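The iteration just described can be sketched as follows, assuming Snowmass-style E_T-weighted centroids and towers given as `(et, eta, phi)` tuples. The function names, the tolerance and the iteration cap are all assumptions for illustration; this is not the collaborations' code.

```python
import math

def dphi(a, b):
    """Signed azimuthal difference a - b, wrapped into (-pi, pi]."""
    d = (a - b) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d

def centroid(towers, eta_c, phi_c, R):
    """E_T-weighted centroid of the towers inside the cone, or None if empty."""
    inside = [t for t in towers
              if math.hypot(t[1] - eta_c, dphi(t[2], phi_c)) <= R]
    et = sum(t[0] for t in inside)
    if et <= 0:
        return None
    eta = sum(t[0] * t[1] for t in inside) / et
    # Average phi relative to the cone axis to avoid wrap-around problems.
    phi = phi_c + sum(t[0] * dphi(t[2], phi_c) for t in inside) / et
    return eta, phi, et

def find_stable_cone(towers, eta0, phi0, R=0.7, tol=1e-6, max_iter=100):
    """Move the cone to its centroid until it stops 'flowing'."""
    eta_c, phi_c = eta0, phi0
    for _ in range(max_iter):
        c = centroid(towers, eta_c, phi_c, R)
        if c is None:
            return None
        eta_n, phi_n, et = c
        if math.hypot(eta_n - eta_c, dphi(phi_n, phi_c)) < tol:
            return eta_n, phi_n, et          # flow vector ~ 0: proto-jet
        eta_c, phi_c = eta_n, phi_n
    return None                              # did not converge

towers = [(50.0, 0.0, 0.0), (50.0, 0.4, 0.0)]
proto = find_stable_cone(towers, 0.0, 0.0)
```

Starting on the first tower, the cone flows to the midpoint of the two equal towers and stops there, which is the kind of configuration the starting points need to reach.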


Theoretically can look “everywhere” and find all stable cones

Experimentally reduce size of analysis by putting initial cones only at seeds – energetic towers or clusters of towers – thus introducing undesirable IR sensitivity and missing certain possible 2-jets-in-1 configurations


Recommended solutions:


·       Seedless (but still efficient) cone algorithm – put initial cone at center of every tower

·       Legacy cone algorithm with midpoints between seeds as extra seeds (remove most of IR sensitivity and find most configurations)
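The midpoint idea in the second recommendation can be sketched as below. The 2R separation cut (only pairs that could share a cone get a midpoint) and the function name `midpoint_seeds` are assumptions for illustration; the synopsis itself only says "midpoints between seeds".

```python
import math
from itertools import combinations

def dphi(a, b):
    """Signed azimuthal difference a - b, wrapped into (-pi, pi]."""
    d = (a - b) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d

def midpoint_seeds(seeds, R=0.7):
    """Return extra starting points midway between nearby seed pairs.

    seeds: list of (eta, phi). Pairs farther apart than 2R cannot both
    lie in one cone, so they are skipped (an assumed cut).
    """
    extra = []
    for (eta1, phi1), (eta2, phi2) in combinations(seeds, 2):
        d = dphi(phi2, phi1)
        if math.hypot(eta2 - eta1, d) < 2 * R:
            extra.append((0.5 * (eta1 + eta2), phi1 + 0.5 * d))
    return extra

extra = midpoint_seeds([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)])
```

The extra points are then used as additional starting cones, so a stable cone sitting between two seeds is no longer missed when soft radiation is absent.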




Compare 3 versions of cone algorithm to test the impact of NOT finding all of the low energy proto-jets in the “flat plains”.  These are presumably just local fluctuations and not the footprints of energetic partons.

1.  Streamlined Seedless – only keep cones that do NOT “flow” outside of original tower

2.  Seedless – keep as preproto-jets only those towers for which centroid is within the original tower then fully process

3.  Fully process all cones that start centered on towers in central region (and which do flow outside of this region) – requires separation cut to avoid “over-merging”


Note that leading jet is robust for the cone jets!


Compare leading jets (table; numerical entries not recoverable):

Columns: Leading ET (GeV) | Leading Tower # | 2nd Leading ET (GeV) | 2nd Leading Tower #

Labeled rows: KT D=0.7, KT D=1.0

(KT algorithms do not count 0 energy towers)

Streamlined Seedless Algorithm: based on “flowing” idea above – stable cones do NOT flow.

·       Put trial cone at center of every tower (in fiducial volume)

·       Calculate centroid for every cone (computer expensive)

·       If centroid is outside of original tower, drop the cone from the analysis – rapidly converges to only stable cones –
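A single pass of the steps above might look like the following sketch. It assumes towers given as `(et, eta, phi)` at tower centers, E_T-weighted centroids, and a uniform tower size `d_eta` x `d_phi`; these names and the 0.1 defaults are hypothetical. Cones that survive would still be iterated and passed to split/merge.

```python
import math

def dphi(a, b):
    """Signed azimuthal difference a - b, wrapped into (-pi, pi]."""
    d = (a - b) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d

def streamlined_seedless(towers, R=0.7, d_eta=0.1, d_phi=0.1):
    """Keep only trial cones whose centroid stays inside the starting tower."""
    kept = []
    for _, eta0, phi0 in towers:              # trial cone on every tower
        inside = [t for t in towers
                  if math.hypot(t[1] - eta0, dphi(t[2], phi0)) <= R]
        et = sum(t[0] for t in inside)
        if et <= 0:
            continue
        eta_c = sum(t[0] * t[1] for t in inside) / et
        off_phi = sum(t[0] * dphi(t[2], phi0) for t in inside) / et
        # Drop the cone if it "flows" out of its original tower.
        if abs(eta_c - eta0) <= d_eta / 2 and abs(off_phi) <= d_phi / 2:
            kept.append((eta_c, phi0 + off_phi, et))
    return kept

towers = [(100.0, 0.0, 0.0), (1.0, 0.5, 0.0)]
cones = streamlined_seedless(towers)
```

In this toy input the cone started on the soft tower flows toward the hard one and is dropped, while the cone started on the hard tower is kept.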




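E-scheme – (for single partons, particles, towers)

 CONE  i ⊂ C iff   $\sqrt{(y_i - y_C)^2 + (\phi_i - \phi_C)^2} \le R$, with the cone 4-vector $p_C = \sum_{i \subset C} p_i$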






In any case, stable cones will overlap some – small effect for leading jet (but does systematically reduce energy).  So must define merge/split phase –

·       process proto-jets in decreasing energy order

·       merge if shared energy > f = 50% of lower proto-jet energy

·       split if shared energy < f = 50%, award shared towers to “closer” proto-jet
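The merge/split rules above can be sketched as follows. This is a simplified illustration under stated assumptions: proto-jets are sets of tower indices, towers are `(et, eta, phi)` with positive E_T, "closer" means closer in (eta, phi) to the E_T-weighted axis, and when several proto-jets overlap the leading one, the highest-E_T neighbour is handled first. The real order of operations matters and must be fully specified, as noted earlier.

```python
import math

def split_merge(protojets, towers, f=0.5):
    """Resolve overlaps between proto-jets; returns disjoint sets of indices."""
    def et(jet):
        return sum(towers[i][0] for i in jet)

    def dphi(a, b):
        d = (a - b) % (2 * math.pi)
        return d - 2 * math.pi if d > math.pi else d

    def axis(jet):
        ref = towers[next(iter(jet))][2]      # reference phi avoids wrap-around
        total = et(jet)
        eta = sum(towers[i][0] * towers[i][1] for i in jet) / total
        phi = ref + sum(towers[i][0] * dphi(towers[i][2], ref) for i in jet) / total
        return eta, phi

    def dist(i, ax):
        return math.hypot(towers[i][1] - ax[0], dphi(towers[i][2], ax[1]))

    jets = [set(j) for j in protojets]
    final = []
    while jets:
        jets.sort(key=et, reverse=True)       # decreasing energy order
        lead = jets[0]
        neighbours = [j for j in jets[1:] if lead & j]
        if not neighbours:
            final.append(jets.pop(0))         # no overlap left: final jet
            continue
        nb = max(neighbours, key=et)          # assumed choice of neighbour
        shared = lead & nb
        if et(shared) > f * et(nb):           # nb is the lower-E_T proto-jet
            lead |= nb                        # merge
            jets = [j for j in jets if j is not nb]
        else:                                 # split: award to closer axis
            ax_lead, ax_nb = axis(lead), axis(nb)
            for i in shared:
                if dist(i, ax_lead) <= dist(i, ax_nb):
                    nb.discard(i)
                else:
                    lead.discard(i)
    return final

towers = [(50.0, 0.0, 0.0), (10.0, 0.3, 0.0), (40.0, 0.6, 0.0)]
split_result = split_merge([{0, 1}, {1, 2}], towers)
```

Here the shared tower carries 10 of the lower proto-jet's 50 units of E_T, below the 50% threshold, so the proto-jets are split and the shared tower goes to the nearer axis.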


3rd possibility – give up geometric simplicity and the problems with overlap and use a KT algorithm. 

·       Combine partons, particles or towers pair-wise based on some measure of “closeness”, beginning with low energy first. 

·       Jet identification is unique – no merge/split stage

·       Resulting jets are more amorphous, energy calibration seemed difficult (subtraction for UE?), and analysis can be very computer intensive (time grows like N^3)
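A naive pairwise clustering of this kind can be sketched as below. The synopsis only says "some measure of closeness"; this sketch assumes the common inclusive KT distances d_ij = min(pT_i^2, pT_j^2) ΔR_ij^2 / D^2 and d_i = pT_i^2, with E-scheme recombination. The function name and the `(E, px, py, pz)` input layout are hypothetical. The nested pair search repeated for each merge is what gives the N^3 growth.

```python
import math

def kt_cluster(particles, D=1.0):
    """Inclusive KT-style clustering of 4-vectors (E, px, py, pz)."""
    def pt2(p):
        return p[1] ** 2 + p[2] ** 2

    def rap(p):
        return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

    def dr2(a, b):
        dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
        if dphi > math.pi:
            dphi = 2 * math.pi - dphi
        return (rap(a) - rap(b)) ** 2 + dphi ** 2

    objs = [tuple(p) for p in particles]
    jets = []
    while objs:
        # Smallest "beam" distance d_i = pT_i^2 (lowest pT goes first).
        best = min(range(len(objs)), key=lambda i: pt2(objs[i]))
        d_min, pair = pt2(objs[best]), None
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d = min(pt2(objs[i]), pt2(objs[j])) * dr2(objs[i], objs[j]) / D ** 2
                if d < d_min:
                    d_min, pair = d, (i, j)
        if pair is None:
            jets.append(objs.pop(best))       # nothing closer: promote to jet
        else:
            i, j = pair
            merged = tuple(a + b for a, b in zip(objs[i], objs[j]))  # E-scheme
            objs.pop(j)                       # j > i, so pop j first
            objs.pop(i)
            objs.append(merged)
    return jets

particles = [
    (50.0, 50.0, 0.0, 0.0),
    (30.0, 30.0 * math.cos(0.2), 30.0 * math.sin(0.2), 0.0),
    (40.0, -40.0, 0.0, 0.0),
]
kt_jets = kt_cluster(particles, D=1.0)
```

Every input ends up in exactly one jet, so no merge/split stage is needed; the price is the irregular jet shape and the computing time noted above.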


Reduce number of “particles” by preclustering algorithm – can also deal with negative (and low) tower energies

·       BUT must be careful not to introduce biases like the seeds did

·       Still some serious work to do here



·       Both experiments use legacy (mid-point), seedless and KT algorithms (all three!)

·       Use identical versions except for issues required by physical differences – all of this in preclustering??

·       Use E-scheme variables for jet algorithm and recombination




So our homework assignment has been:


·       Develop “common” code for an improved legacy cone algorithm

·       Develop “common” code for a seedless cone algorithm

·       Develop “common” code for KT algorithm

·       Use E-scheme variables throughout

·       Account for differences in detectors with a “sensible” preclustering scheme

How have we done?         Are we done?