The first topic we’ll cover is data structures. Today I want to zoom in on sets and lists: when to use a set and when to use a list in automation scripts.
An example: you want to collect unique asset numbers from multiassetlocci.
A common approach is to use a list of asset numbers. Functionally that works fine, until you run it on a record with 20,000 assets and are surprised by how slow it is.
assetnums = []
for row in multiassetset:
    asset = row.getString("ASSETNUM")
    if asset and asset not in assetnums:
        assetnums.append(asset)
What’s happening here?
The check "asset not in assetnums" scans the list every time. That means up to k comparisons, or O(k), when the list already holds k items.
You do that check for every row (n rows) → O(n²) total work in the worst case.
With 20,000 unique asset numbers that adds up to roughly 200 million string comparisons! That’s why it feels mysteriously slow.
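To check that figure, here is a minimal sketch in plain Python with no Maximo objects involved. The helper name count_list_comparisons is made up for this illustration, and it assumes the worst case where every asset number is unique.

```python
def count_list_comparisons(n):
    """Worst-case comparisons for the list-based approach when all n values are unique."""
    total = 0
    for k in range(n):
        # The next new value is compared against all k values already in the list.
        total += k
    return total


worst_case = count_list_comparisons(20000)
print(worst_case)  # 199990000, which equals n * (n - 1) / 2
```

With many duplicates the real number is lower, because a scan stops as soon as it finds a match.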
A set is the better and faster choice here. A set never contains duplicates, so you don’t have to check for them yourself: adding a value that is already there simply does nothing. Membership checks and inserts also take constant time on average, O(1), no matter how many asset numbers the set already holds. The total work drops from O(n²) to O(n).
assetnums = set()
for row in multiassetset:
    asset = row.getString("ASSETNUM")
    if asset:
        assetnums.add(asset)
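If you want to feel the difference yourself, the rough timing sketch below runs both loops on plain Python strings instead of a Maximo MboSet. The function names and the generated asset numbers are assumptions made for this example, and the timings will vary by machine and interpreter.

```python
import time


def unique_with_list(values):
    """List-based version: every membership check scans the list."""
    seen = []
    for v in values:
        if v and v not in seen:
            seen.append(v)
    return seen


def unique_with_set(values):
    """Set-based version: duplicates are dropped automatically."""
    seen = set()
    for v in values:
        if v:
            seen.add(v)
    return seen


# Hypothetical test data: 2,000 distinct asset numbers, each appearing twice.
values = ["ASSET%05d" % i for i in range(2000)] * 2

start = time.time()
as_list = unique_with_list(values)
list_seconds = time.time() - start

start = time.time()
as_set = unique_with_set(values)
set_seconds = time.time() - start

print("list: %.4f s, set: %.4f s" % (list_seconds, set_seconds))
```

Both versions find the same asset numbers. The gap between the two timings should widen quickly as the number of distinct values grows.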
Just remember:
- Use set() when you only need unique values or fast membership checks.
- Use list() when order or index access matters.
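Sometimes you need both: unique values and the original order. One possible pattern, sketched here in plain Python with made-up sample values and a hypothetical helper name, is to keep a set for the membership check and a list for the result.

```python
def unique_in_order(values):
    """Return unique, non-empty values in first-seen order."""
    seen = set()   # fast "have I seen this already?" checks
    ordered = []   # keeps first-seen order and allows index access
    for v in values:
        if v and v not in seen:
            seen.add(v)
            ordered.append(v)
    return ordered


result = unique_in_order(["A100", "B200", "A100", "", "C300", "B200"])
print(result)  # ['A100', 'B200', 'C300']
```

This costs a bit of extra memory for the second collection, but each check stays constant time on average.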
Ever noticed the difference yourself? Or do you have a script that felt just a bit too slow for no reason?
I would be happy to assist: info@annacode.nl
