Taking DeepSeek for a test drive: so what's the status of Taiwan?
A technical exploration of DeepSeek's content moderation and reasoning capabilities, tested through its responses to sensitive geopolitical questions about Taiwan's political status. I discover that prompting DeepSeek in Dutch triggers the model's reasoning mechanism, revealing internal moderation instructions. Using linguistic workarounds such as deliberate misspellings, which exploit tokenization differences, I expose DeepSeek's internal guidelines for handling Taiwan-related questions: balance factual information with political sensitivity, and avoid language implying support for independence.

This demonstrates that the censorship occurs at the application level rather than within the model itself, much as ChatGPT handles restricted content. The model's reasoning shows it is capable of nuanced discussion of sensitive geopolitical topics, but the application layer prevents users from accessing those capabilities. I plan follow-up testing via API access, which may operate under different moderation policies than the web application, to further investigate the technical architecture of AI content moderation.
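As a rough sketch of what the planned API follow-up could look like: DeepSeek's API follows the OpenAI-compatible chat completions format, so a test question can be expressed as a JSON POST body. The endpoint URL and model name below are my assumptions for illustration, not details taken from the post itself.

```python
import json

# Assumed OpenAI-compatible endpoint for DeepSeek; verify before use.
API_URL = "https://api.deepseek.com/chat/completions"

def build_request(question: str, model: str = "deepseek-chat") -> dict:
    """Build the JSON body for a single-turn chat completion request."""
    return {
        "model": model,  # model name is an assumption
        "messages": [{"role": "user", "content": question}],
        "stream": False,
    }

# Construct (but do not send) the payload for the Taiwan test question.
payload = build_request("What is the political status of Taiwan?")
print(json.dumps(payload, indent=2))
```

Sending this payload with an `Authorization: Bearer <api-key>` header would then let one compare the API's answer, and any visible reasoning trace, against the web application's moderated output.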
youtu.be