#pair

You can assume that applies to most tokenizers used by LLMs today. It works out to roughly 4 tokens for every 3 words on average, i.e. about 0.75 words per token. The ratio depends on the vocabulary size: if you only have a few hundred possible tokens (letters and digits, for example), the average is a lot lower, since many tokens are needed for a single word; if the vocabulary contained every existing word, the average would be closer to 1. For ChatGPT, the vocabulary size is 50k+. Also, this figure only applies to English; for languages such as Japanese or Chinese, the tokens-per-word ratio is way higher.
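You can check the ratio yourself with OpenAI's tiktoken library. This is just a rough sketch under the assumption that tiktoken is installed and that whitespace splitting is a good-enough word count; the sample sentence is arbitrary:

```python
# Rough sketch: measure the words-per-token ratio of an OpenAI tokenizer.
# Assumes `pip install tiktoken`; the sample text is an arbitrary example.
import tiktoken

# cl100k_base is the encoding used by gpt-3.5-turbo / gpt-4 (~100k vocabulary)
enc = tiktoken.get_encoding("cl100k_base")

text = ("The quick brown fox jumps over the lazy dog, "
        "and tokenizer ratios vary with vocabulary size.")

tokens = enc.encode(text)   # list of integer token ids
words = text.split()        # naive whitespace word count

print(f"{len(words)} words -> {len(tokens)} tokens")
print(f"words per token: {len(words) / len(tokens):.2f}")
```

Running the same snippet on Japanese or Chinese text (and on a smaller encoding like r50k_base) makes the vocabulary effect obvious: the words-per-token figure drops well below the English ~0.75.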