Chocolate, a great human pleasure, can go horribly wrong when given to dogs. Similarly, tokenization, when applied in the context of cloud data protection, can leave your data extremely vulnerable.
Tokenization is a simple technique of replacing sensitive data with a non-sensitive token and creating a mapping between the token and the original data. The mapping is 1:1 - the same value occurring multiple times is always replaced with the same token. The idea is that if an attacker gets hold of a tokenized data set, they cannot recreate the original data without having access to the mapping.
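A minimal sketch of such a token vault, assuming random hex tokens and an in-memory dictionary as the mapping (real products persist and protect this mapping):

```python
import secrets

class TokenVault:
    """Deterministic token vault: each distinct value maps to exactly one token."""

    def __init__(self):
        self._value_to_token = {}
        self._token_to_value = {}

    def tokenize(self, value):
        # Reuse the existing token so the mapping stays 1:1.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = secrets.token_hex(8)  # random; carries no information about the value
        self._value_to_token[value] = token
        self._token_to_value[token] = value
        return token

    def detokenize(self, token):
        # Only someone holding the vault can reverse a token.
        return self._token_to_value[token]

vault = TokenVault()
t1 = vault.tokenize("4111111111111111")
t2 = vault.tokenize("4111111111111111")
assert t1 == t2  # same value, same token
assert vault.detokenize(t1) == "4111111111111111"
```

Note that the token itself is random, so an attacker who steals only the tokenized data set learns nothing about any individual value.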
The technique was made popular by the PCI DSS standard because it works very well for protecting credit card numbers. Statistical attacks are the most common attacks on tokenized data sets, but since credit card numbers are unique and uncorrelated, such attacks are ineffective. For this use case, tokenization is like chocolate for humans, perhaps even with mythical health benefits.
In contrast, most other data sets that need to be protected in the cloud are highly correlated. For example, suppose you applied tokenization to the last names of the customers of your China division in Salesforce. Well-published statistics show that Wang is the most common last name in China, occurring 7.6% of the time. Pick the most frequently occurring token in the tokenized data set, and with very high probability you have determined that it stands for Wang, without any access to the mapping.
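The frequency attack described above can be sketched in a few lines. The name counts below are hypothetical, chosen only to mimic the real-world skew; the attacker is assumed to see the tokenized column and to know publicly available name statistics, but not the vault:

```python
from collections import Counter
import secrets

# Hypothetical customer list, skewed the way real last-name data is.
population = ["Wang"] * 76 + ["Li"] * 72 + ["Zhang"] * 68 + ["Liu"] * 52 + ["Chen"] * 45

# Deterministic tokenization: one random token per distinct value (1:1 mapping).
mapping = {name: secrets.token_hex(4) for name in set(population)}
tokenized = [mapping[name] for name in population]

# The attacker sees only `tokenized`, but knows Wang is the most common
# last name, so the most frequent token almost surely stands for Wang.
most_common_token, count = Counter(tokenized).most_common(1)[0]
assert most_common_token == mapping["Wang"]  # recovered without the vault
```

Because the 1:1 mapping preserves the frequency distribution of the original data, any data set with a known skew leaks this way.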
Many CASB vendors are pushing tokenization as the method for cloud data protection. While chocolate is great for humans, it is not good for dogs!