There’s a surprising amount of post-excitement fear, uncertainty and doubt being directed at DeepSeek. I’ve seen posts claiming that open sourcing the model creates a security hole, posts questioning the sources of DeepSeek’s training data (I won’t cover this here, but it is absolutely a concern), and posts pointing out that DeepSeek’s T&Cs grant them perpetual use of your prompts and personal data.

Are these fears valid? Except for the open source one (I can’t believe we still worry about open source being less secure; I hadn’t heard that one in years), yes, they’re valid fears. The issue is that you shouldn’t be worried about just DeepSeek: you should ask all of these same questions of any hosted LLM service, and of any system you don’t have full control over. If you’re not paying for the product, you are the product. Well…sort of. I still strongly believe in open source, which typically allows you to run something “for free” on the hardware of your choice. The difference between open source and something that’s vaguely “free” is that real open source publishes a license describing the project’s terms of use.

“But isn’t DeepSeek open source?” Yes, parts of DeepSeek are open source. In the ongoing discussion around LLMs and privacy, we tend to treat an LLM as the combination of the actual “brain” (the model) and the runtime used to interact with it. DeepSeek has 16 open source repositories on GitHub and 13 collections on Hugging Face. They also published papers on their methodology that others can learn from. All of these things are positive and follow a pattern set by Meta when it released its Llama models. However, just like with Meta, there’s a difference between open sourcing a model and open sourcing the runtime you use to interact with that model.

If you have enough hardware, you can run Meta’s Llama models or DeepSeek’s R1 model yourself using a runtime like Ollama or LM Studio. Now you’re trusting Ollama or LM Studio with your information instead, so you need to ask yourself whether they can be trusted with whatever data you’re giving them, but (unless they’re doing something truly nefarious) none of your data is being passed back to the model authors.
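As a concrete sketch of what that looks like, here are the Ollama commands to pull and chat with a distilled R1 variant. The `deepseek-r1:7b` tag is an example, so check Ollama’s model library for the tags actually available, and note that the full-size R1 model needs far more hardware than a typical desktop.

```shell
# Download the model weights once; this is the only network step.
ollama pull deepseek-r1:7b

# Start an interactive chat -- prompts are processed locally, and
# nothing here talks back to DeepSeek's servers.
ollama run deepseek-r1:7b
```

Everything after the initial download happens on your own machine, which is exactly the split between model and runtime described above.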

This split between the LLM and the runtime that allows you to interact with it is going to be a theme in future articles. While I’m completely blown away by the rapidly advancing abilities of LLMs, I’m more interested in the runtimes we’re wrapping around our LLM “brains”. AI agents and chatbots will give you different results when you swap out the LLM that drives them, but what an agent or chatbot can accomplish is equally dictated by the choice of runtime.

Circling back to the misplaced worries over DeepSeek: should you be worried that DeepSeek’s R1 model is stealing your data? No. Should you be worried that DeepSeek’s free app-store app that connects to R1 is collecting your data? Within reason, yes. Always interact with any application (web, mobile, desktop) with a bit of skepticism. However, if you’re not trying out new AI models as quickly as they’re being released, you’re missing out on a tremendously exciting world. Healthy skepticism is always a good idea…letting fear get in the way of an opportunity to grow? That’s a mistake.