Building (Better) Jet Algorithms for Run II – Workshop Results:
For the full report see - www.physics.niu.edu/~blazey/jet_alg/jet_alg.html
From G.C. Blazey, J.R. Dittmann, S.D. Ellis, V.D. Elvira, K. Frame, S. Grinstein, R. Hirosky, P. Piegaia, H. Schellman, R. Snihur, V. Sorin, D. Zeppenfeld
Here is my brief synopsis:
General Point: Many issues derive from the fact that experimental priorities/needs do not always match theoretical priorities/needs!
“History” starts 10 years ago with the “Snowmass Accord” (or the Snowmass Algorithm). Idea was to have an agreed upon algorithm (hence accord) that everyone would use. But, in practice, it was flawed –
· Was not efficient – experimenters used seeds to limit where one looked for jets – this introduces IR sensitivity at NNLO
· Did not treat issue of overlapping cones – split/merge question
· Can miss configurations with widely separated energy that can still fit in a cone but with little energy in between – introduced RSEP – theory parameter to mimic this effect
Note: two logically distinct phases to jet finding:
1. Identify contents of jets - partons, particles, calorimeter towers – jet algorithm
2. Add kinematic properties of jet contents (e.g., 4-vectors) to find jet kinematic properties – recombination scheme
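A minimal sketch of the second phase, assuming the recombination scheme is plain four-vector addition (the E-scheme) and that jet contents are given as (E, px, py, pz) tuples. The function names `e_scheme_sum` and `jet_kinematics` are hypothetical, chosen only for this illustration.

```python
import math

def e_scheme_sum(constituents):
    """Add (E, px, py, pz) four-vectors component by component."""
    E = sum(c[0] for c in constituents)
    px = sum(c[1] for c in constituents)
    py = sum(c[2] for c in constituents)
    pz = sum(c[3] for c in constituents)
    return (E, px, py, pz)

def jet_kinematics(p4):
    """Return (pT, rapidity, phi, mass) of a summed four-vector."""
    E, px, py, pz = p4
    pt = math.hypot(px, py)
    y = 0.5 * math.log((E + pz) / (E - pz))
    phi = math.atan2(py, px)
    mass2 = E * E - px * px - py * py - pz * pz
    # Guard against tiny negative values from rounding.
    return pt, y, phi, math.sqrt(max(mass2, 0.0))
```

Note that the jet generally acquires a mass even when every constituent is massless, which is one way the choice of recombination scheme shows up in the final jet kinematics.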
“Big Picture” goal: do 1% Jet Physics – precision QCD – in Run II,
· for its own sake
· to isolate physics beyond the Standard Model
Requires control of the Jet Energy to better than 1% (rates fall rapidly with the jet energy)
Requires understanding of jet energy calibration to < 1%
Requires understanding of corrections “back to” fixed order perturbation theory to < 1%
Requires “full disclosure” of all jet algorithm details
Strongly Recommends that both collaborations use the same (very similar) jet algorithm
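A rough worked estimate of why a rapidly falling rate demands this level of energy control. It assumes, purely for illustration, that the rate of jets above a threshold E_0 falls as a power law with exponent m; neither the power-law form nor any value of m comes from the report.

```latex
% Assumption: N(E_T > E_0) = C\,E_0^{-m}.
% A fractional miscalibration \epsilon gives E_T^{\rm meas} = (1+\epsilon)\,E_T,
% so the events passing the cut are those with E_T > E_0/(1+\epsilon):
\[
N^{\rm meas}(E_T > E_0) = C\left(\frac{E_0}{1+\epsilon}\right)^{-m}
  = N(E_T > E_0)\,(1+\epsilon)^{m}
\quad\Longrightarrow\quad
\frac{\delta N}{N} \approx m\,\epsilon .
\]
% Illustration only: m = 5 and \epsilon = 1\% give \delta N / N \approx 5\%.
```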
History of HIDDEN issues, all of which influence the result
· Energy Cut on towers kept in analysis (e.g., to avoid noise)
· (Pre)Clustering to find seeds
- Energy Cut on precluster towers
- Energy cut on clusters
- Energy cut on seeds kept
· Starting with seeds find stable cones by iteration
- In JETCLU, “once in a cone, always in a cone”, the “ratchet” or “Velcro” effect
· Overlapping stable cones must be split/merged
- Depends on overlap parameter fmerge
- Order of operations matters
All of these issues impact the content of the “found” jet
- Shape may not be a cone
- Number of towers can differ, i.e., different energy
- Corrections for underlying event must be tower by tower
Can be more important than the differences between algorithms, e.g., seeded vs seedless
Need to learn from the past!
Goals of IDEAL ALGORITHM (motherhood)
Fully Specified: including defining in detail any preclustering, merging, and splitting issues
Theoretically Well Behaved: the algorithm should be infrared and collinear safe (and insensitive) with no ad hoc clustering parameters (e.g., RSEP)
Detector Independence: there should be no dependence on cell type, numbers, or size
Order Independence: The algorithms should behave equally at the parton, particle, and detector levels.
Uniformity: everyone uses the same algorithms (to the best possible approximation) – Recommendation: 1 legacy cone, 1 seedless cone, 1 KT algorithm – with E-scheme momentum recombination
Theory –
· Boost invariant results – use variables with appropriate boost properties
· Kinematic boundary stability – use variables with appropriate energy conservation to allow resummation calculations
⇒ Use E-Scheme variables
· Minimize resolution smearing and angle biases
· Stability with luminosity – not sensitive to multiple collisions
· Efficient use of computer resources – but do not let this drive problems with physics issues (e.g., seeds and preclustering)
· Easy to calibrate – not so worried about size of corrections as with accuracy of corrections
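Cone Algorithm – particles, calorimeter towers, partons in cone of size R, defined in angular space, e.g., Snowmass (η,φ)
CONE center: $(\eta_C, \phi_C)$
CONE contents: $i \subset C$ iff $\sqrt{(\eta_i - \eta_C)^2 + (\phi_i - \phi_C)^2} \le R$
Energy: $E_T^C = \sum_{i \subset C} E_T^i$
Centroid: $\bar\eta_C = \frac{1}{E_T^C} \sum_{i \subset C} E_T^i \, \eta_i$, $\bar\phi_C = \frac{1}{E_T^C} \sum_{i \subset C} E_T^i \, \phi_i$
Jet is defined by “stable” cone: $\bar\eta_C = \eta_C$, $\bar\phi_C = \phi_C$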
Stable cones found by iteration: start with cone anywhere (and, in principle, everywhere), calculate the centroid of this cone, put new cone at centroid, iterate until cone stops “flowing”, i.e., stable ⇒ Proto-jets (prior to split/merge)
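The iteration can be sketched as follows, taking the centroid to be the ET-weighted mean position of the towers inside the cone (the Snowmass-style definition, assumed here). Towers are (ET, eta, phi) tuples; the function names, the default R = 0.7, the tolerance, and the iteration cap are illustrative assumptions, not values fixed by the report.

```python
import math

def delta_phi(a, b):
    """Signed azimuthal difference a - b, wrapped into (-pi, pi]."""
    d = (a - b) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d

def cone_centroid(towers, eta_c, phi_c, R):
    """ET sum and ET-weighted centroid of towers within R of the center."""
    et_sum = eta_sum = dphi_sum = 0.0
    for et, eta, phi in towers:
        dphi = delta_phi(phi, phi_c)
        if math.hypot(eta - eta_c, dphi) <= R:
            et_sum += et
            eta_sum += et * eta
            dphi_sum += et * dphi
    if et_sum <= 0.0:
        return None  # empty cone
    return et_sum, eta_sum / et_sum, phi_c + dphi_sum / et_sum

def find_stable_cone(towers, eta0, phi0, R=0.7, tol=1e-6, max_iter=100):
    """Move the cone to its centroid until it stops flowing."""
    eta_c, phi_c = eta0, phi0
    for _ in range(max_iter):
        result = cone_centroid(towers, eta_c, phi_c, R)
        if result is None:
            return None
        et, eta_new, phi_new = result
        if math.hypot(eta_new - eta_c, delta_phi(phi_new, phi_c)) < tol:
            return et, eta_new, phi_new  # stable cone: a proto-jet
        eta_c, phi_c = eta_new, phi_new
    return None  # did not converge within max_iter
```

Starting this from a limited set of seeds rather than everywhere is what makes the legacy approach fast, and also what introduces the seed sensitivity discussed above.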
Recommended solutions:
· Seedless (but still efficient) cone algorithm – put initial cone at center of every tower
· Legacy cone algorithm with midpoints between seeds as extra seeds (remove most of IR sensitivity and find most configurations)
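A minimal sketch of the midpoint idea: given the (eta, phi) positions found from the ordinary seeds, add the point halfway between each pair as an extra starting position. Restricting to pairs closer than 2R (so that a single cone could contain both) is an assumption of this sketch, as are the function name and the default R = 0.7.

```python
import math
from itertools import combinations

def midpoint_seeds(seeds, R=0.7):
    """Return midpoints of all seed pairs separated by less than 2R."""
    extra = []
    for (eta1, phi1), (eta2, phi2) in combinations(seeds, 2):
        # Wrap the azimuthal difference into [-pi, pi).
        dphi = (phi2 - phi1 + math.pi) % (2 * math.pi) - math.pi
        if math.hypot(eta2 - eta1, dphi) < 2 * R:
            extra.append(((eta1 + eta2) / 2, phi1 + dphi / 2))
    return extra
```

Each extra position would then be iterated to a stable cone in the same way as an ordinary seed.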
Compare 3 versions of cone algorithm to test the impact of NOT finding all of the low energy proto-jets in the “flat plains”. These are presumably just local fluctuations and not the footprints of energetic partons.
1. Streamlined Seedless – only keep cones that do NOT “flow” outside of original tower
2. Seedless – keep as preproto-jets only those towers for which centroid is within the original tower then fully process
3. Fully process all cones that start centered on towers in central region (and which do flow outside of this region) – requires separation cut to avoid “over-merging”
Note that leading jet is robust for the cone jets!
Compare leading jets:
| Algorithm | Leading ET (GeV) | Leading Tower # | 2nd Leading ET (GeV) | 2nd Leading Tower # |
|---|---|---|---|---|
| 1 | 25.59 | 156 | 17.32 | 126 |
| 2 | 25.62 | 157 | 16.98 | 113 |
| 3 | 25.62 | 156 | 14.91 | 94 |
| KT D=0.7 | 28.70 | 151 | 16.89 | 84 |
| KT D=1.0 | 32.59 | 200 | 26.26 | 203 |
(KT algorithms do not count 0 energy towers)
Streamlined Seedless Algorithm: based on “flowing” idea above – stable cones do NOT flow.
· Put trial cone at center of every tower (in fiducial volume)
· Calculate centroid for every cone (computer expensive)
· If centroid is outside of original tower, drop the cone from the analysis – rapidly converges to only stable cones – proto-jets
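The three steps above can be sketched as a single pass over the towers. This is an illustration under stated assumptions: towers are (ET, eta, phi) tuples located at tower centers, the tower size `d_eta` x `d_phi` = 0.1 x 0.1 and the cone size R = 0.7 are placeholders, and the function name is hypothetical.

```python
import math

def streamlined_seedless(towers, R=0.7, d_eta=0.1, d_phi=0.1):
    """Keep only trial cones whose centroid stays inside the starting tower."""
    kept = []
    for _, eta_c, phi_c in towers:  # one trial cone per tower center
        et_sum = eta_sum = dphi_sum = 0.0
        for et, eta, phi in towers:
            dphi = (phi - phi_c + math.pi) % (2 * math.pi) - math.pi
            if math.hypot(eta - eta_c, dphi) <= R:
                et_sum += et
                eta_sum += et * eta
                dphi_sum += et * dphi
        if et_sum <= 0.0:
            continue
        shift_eta = eta_sum / et_sum - eta_c
        shift_phi = dphi_sum / et_sum
        # Drop the cone if its centroid has flowed out of the original tower.
        if abs(shift_eta) <= d_eta / 2 and abs(shift_phi) <= d_phi / 2:
            kept.append((et_sum, eta_c + shift_eta, phi_c + shift_phi))
    return kept
```

The double loop makes the cost of the centroid step explicit: every tower is tested against every trial cone.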



In any case, stable cones will overlap some – small effect for leading jet (but does systematically reduce energy). So must define merge/split phase –
· process proto-jets in decreasing energy order
· merge if shared energy > f=50% of lower proto-jet energy
· split if shared energy < f=50% of lower proto-jet energy, award shared towers to “closer” proto-jet
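A minimal sketch of this merge/split phase. Proto-jets are represented as sets of tower indices and towers as a dict of index to (ET, eta, phi); these data structures, the function names, the choice to pair the leading proto-jet with its highest-ET overlapping neighbor, and the neglect of phi wrap-around in the centroid are all simplifying assumptions of this example, not details specified in the report.

```python
import math

def et_of(towers, members):
    return sum(towers[i][0] for i in members)

def centroid(towers, members):
    """ET-weighted (eta, phi); phi wrap-around is ignored in this sketch."""
    et = et_of(towers, members)
    eta = sum(towers[i][0] * towers[i][1] for i in members) / et
    phi = sum(towers[i][0] * towers[i][2] for i in members) / et
    return eta, phi

def split_merge(towers, protojets, f=0.5):
    protojets = [set(p) for p in protojets]
    jets = []
    while protojets:
        # Process proto-jets in decreasing energy order.
        protojets.sort(key=lambda p: et_of(towers, p), reverse=True)
        lead = protojets.pop(0)
        partner = next((p for p in protojets if lead & p), None)
        if partner is None:
            jets.append(lead)  # no overlap left: final jet
            continue
        protojets.remove(partner)
        shared = lead & partner
        if et_of(towers, shared) > f * et_of(towers, partner):
            protojets.append(lead | partner)  # merge
        else:
            c_lead, c_part = centroid(towers, lead), centroid(towers, partner)
            for i in shared:  # split: award each shared tower to the closer one
                _, eta, phi = towers[i]
                d_lead = math.hypot(eta - c_lead[0], phi - c_lead[1])
                d_part = math.hypot(eta - c_part[0], phi - c_part[1])
                if d_lead <= d_part:
                    partner.discard(i)
                else:
                    lead.discard(i)
            protojets.extend([lead, partner])
    return jets
```

Because the list is re-sorted after every merge or split, the outcome depends on the order of operations, which is why that order has to be part of the algorithm's specification.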
3rd possibility – give up geometric simplicity and the problems with overlap and use a KT algorithm.
· Combine partons, particles or towers pair-wise based on some measure of “closeness”, beginning with low energy first.
· Jet identification is unique – no merge/split stage
· Resulting jets are more amorphous, energy calibration seemed difficult (subtraction for UE?), and analysis can be very computer intensive (time grows like N^3)
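A naive sketch of pair-wise KT clustering. The notes above do not spell out the “closeness” measure, so this example assumes the commonly used inclusive form: d_ij = min(pT_i^2, pT_j^2) ΔR_ij^2 / D^2 between pairs and d_i = pT_i^2 for single objects, with E-scheme (four-vector) merging. The function names and the massless-input helper are hypothetical.

```python
import math

def from_pt_y_phi(pt, y, phi):
    """Massless (E, px, py, pz) from pT, rapidity and azimuth."""
    return (pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y))

def pt2(p):
    return p[1] ** 2 + p[2] ** 2

def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def kt_cluster(particles, D=1.0):
    """Cluster four-vectors until every object has been promoted to a jet."""
    objs = [tuple(p) for p in particles]
    jets = []
    while objs:
        best = (None, None, math.inf)
        for i, pi in enumerate(objs):
            if pt2(pi) < best[2]:
                best = (i, None, pt2(pi))  # single-object measure d_i
            for j in range(i + 1, len(objs)):
                pj = objs[j]
                dphi = math.atan2(pi[2], pi[1]) - math.atan2(pj[2], pj[1])
                dphi = (dphi + math.pi) % (2 * math.pi) - math.pi
                dr2 = (rapidity(pi) - rapidity(pj)) ** 2 + dphi ** 2
                dij = min(pt2(pi), pt2(pj)) * dr2 / D ** 2
                if dij < best[2]:
                    best = (i, j, dij)
        i, j, _ = best
        if j is None:
            jets.append(objs.pop(i))  # smallest measure is d_i: finished jet
        else:
            merged = tuple(a + b for a, b in zip(objs[i], objs[j]))
            objs.pop(j)  # j > i, so remove j first
            objs.pop(i)
            objs.append(merged)
    return jets
```

Every object ends up in exactly one jet, so no merge/split stage is needed. The full pair search is repeated after each step, which is the origin of the N^3 growth in this naive form.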
Reduce number of “particles” by preclustering algorithm – can also deal with negative (and low) tower energies
· BUT must be careful not to introduce biases like the seeds did
· Still some serious work to do here
RECOMMEND:
· Both experiments use legacy (mid-point), seedless and KT algorithms (all three!)
· Use identical versions except for issues required by physical differences – all of this in preclustering??
· Use E-scheme variables for jet algorithm and recombination
So our homework assignment has been:
· Develop “common” code for an improved legacy cone algorithm
· Develop “common” code for a seedless cone algorithm
· Develop “common” code for KT algorithm
· Use E-scheme variables throughout
· Account for differences in detectors with a “sensible” preclustering scheme
How have we done? Are we done?




