The paper clearly states that knowledge distillation was used, and it doesn't claim to be the original base model. At least it proves that the path of model compression can be viable, but it insists on elevating it to a political issue.