
AI vs. critical thinking in cybersecurity. Reflections from a SOC specialist

Kuba Pęksyk
25/06/2025

Large language models often generate answers that sound credible but contain factual errors or unverified assumptions, especially when the user does not provide precise domain context. Studies suggest that such inaccuracies can affect tens of percent of responses. In cybersecurity, this is not merely a "worse answer," but a real risk of drawing the wrong operational conclusion, which can lead to a bad decision, a missed incident, or an escalation of the situation. AI without expert validation creates an illusion of analysis, not real security support.

In independent analyses, more than 45% of answers generated by large language models contained at least one material inaccuracy, and academic research on hallucinations points in the same direction. This means AI's technical answers can include serious operational mistakes if they are not critically verified by an expert.

As a cybersecurity practitioner and Deputy SOC360 Director at 4Prime, I observe how artificial intelligence (AI) is changing our day-to-day work. Its impact on our field is one of the hottest topics—and for good reason. The promise of automation, lightning-fast analysis, and autonomous active security systems is tempting: greater resilience and lower costs. But from my perspective, behind the facade of marketing enthusiasm lies a complex reality. I see how uncritical use of AI leads to “madness”—an illusion of security—while the real shield I rely on remains experience and critical thinking. In this article, I want to share my reflections on the benefits and risks of using AI in cybersecurity.

Ambiguous support for analysts: LLM chatbots

Tools based on large language models (LLMs), such as ChatGPT, have become ubiquitous. In theory, they are meant to help Security Operations Center (SOC) analysts process information faster. In practice, however, their impact on day-to-day work and professional development—especially for junior specialists—is deeply concerning to me.

Negative impact on self-development

The main threat I see is the erosion of fundamental analytical skills. Instead of learning how to search, read, and interpret raw data independently—analyzing email headers or manually reviewing scripts—a junior analyst simply asks a chatbot. They get a ready-made, often simplified answer that skips the entire cognitive process. It is a shortcut that does not build competence. Instead of developing their own “analytical muscles,” they rely on an external “exoskeleton” that is not always reliable.
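
To show what I mean by that hands-on work, here is a minimal sketch of manually walking an email's Received chain with Python's standard library (the file name is illustrative):

```python
from email import policy
from email.parser import BytesParser

# Parse a locally saved message; the path is illustrative.
with open("suspicious.eml", "rb") as fh:
    msg = BytesParser(policy=policy.default).parse(fh)

# Walk the Received chain bottom-up: the earliest hop is closest to the sender.
for i, hop in enumerate(reversed(msg.get_all("Received", [])), start=1):
    print(f"hop {i}: {hop}")

# A mismatch between the envelope sender and the visible From: header is a
# classic spoofing indicator worth a closer look, not an automatic verdict.
print("Return-Path:", msg["Return-Path"])
print("From:       ", msg["From"])
```

Reading those hops yourself, rather than pasting the headers into a chatbot, is exactly the "analytical muscle" I have in mind.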

Low reliability and the hallucination phenomenon

We must remember that LLMs do not “know”—they generate text based on statistical probability. Their goal is to produce a coherent and convincing answer, not a verified truth. This leads to “hallucinations,” which everyone has experienced more than once—AI confidently generates false information. In cybersecurity, this is extremely dangerous. A chatbot may invent a non-existent vulnerability, misinterpret a PowerShell command as harmless, or suggest a remediation procedure that actually worsens the situation. My favorite example is a vendor chatbot that—asked to summarize network traffic volume per country—added up port numbers it interpreted as transferred megabytes. Absolute nonsense.

Lack of a proper cognitive process

An experienced analyst conducting research follows a structured methodology: formulating a hypothesis, seeking out primary sources (technical documentation, forum threads), correlating data across multiple systems, verifying findings, and drawing conclusions based on evidence and indicators. An LLM chatbot does not do that. Its "research" is a rapid pattern match over training data. There is no critical source evaluation, no context, no nuance. The output is an illusion of analysis, not analysis itself, which is unacceptable to me.

The need for expert validation

Paradoxically, using LLM chatbots safely and effectively requires enough knowledge and awareness to validate what they produce. As a SOC analyst, I can use an LLM to speed up work—for example, to generate a code snippet—because I can immediately assess correctness and safety. For a junior analyst who lacks these verification capabilities, the chatbot becomes an “oracle,” and its false answers can lead to catastrophic mistakes.

I do want to emphasize that chatbots are excellent for developer work—provided we acknowledge they perform best when building single, well-defined program or script components. I consider them a brilliant tool for generating baseline code, writing simple functions, or translating code fragments between languages. They do not fully replace a developer or engineer in designing complex systems, because they lack understanding of architecture, context, and broader project dependencies.
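
To make this concrete, here is the kind of single, well-defined helper I might ask a chatbot to draft and then review line by line; the function name and the sample log line are mine, not from any particular tool:

```python
import re

# A naive pattern like this is what a chatbot typically proposes first; it
# also matches impossible addresses such as 999.1.2.3, so review adds the
# per-octet range check below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_ipv4(line: str) -> list[str]:
    """Return plausible IPv4 addresses found in a log line."""
    return [
        ip for ip in IPV4_CANDIDATE.findall(line)
        if all(0 <= int(octet) <= 255 for octet in ip.split("."))
    ]

print(extract_ipv4("Failed login from 203.0.113.7, then from 999.1.2.3"))
# -> ['203.0.113.7']
```

The review step is where the value sits: it is the added range check, not the generated regex, that makes the output trustworthy.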

AI in detection mechanisms: an arms race

For years, AI has been an integral part of advanced security systems we use daily, such as NDR (Network Detection and Response) traffic analyzers or EDR (Endpoint Detection and Response) platforms. Its application here also comes with challenges, but I admit it has clear advantages.

High false-positive rate

The biggest curse of AI-based detection mechanisms, in my view, is the high false-positive rate. Machine learning models are trained on standardized, generic datasets. But every organization’s IT environment is unique—it has custom software, non-standard admin scripts, and specific configurations. Without that context, an AI model often flags legitimate but unusual activity as malicious. This leads to “alert fatigue” in my team, which must spend valuable time validating false alarms. AI-driven mechanisms are often presented by vendors as a “black box,” with undocumented operating parameters, which makes tuning and tailoring the system to a given organization challenging.

AI in the hands of cybercriminals

AI is a neutral tool, and it is widely used by all sides of cyber conflict. Criminals use AI to build polymorphic threats faster. Such malware can dynamically modify its code and behavior with each infection, effectively evading signature-based detection and simple heuristics. AI also helps automate reconnaissance, find security weaknesses, and generate highly convincing, personalized phishing messages at scale. This fuels the continuous arms race I take part in every day.

Static and behavioral analysis

That said, I am not completely “anti-AI.”

In static file analysis, the AI models I work with are trained on millions of malicious and benign samples. As a result, they learn to recognize patterns typical of malware, even when no specific signature exists. AI can identify suspicious instruction sequences or unusual file structures. In behavioral analysis, AI builds a model of "normal" system and application behavior. When a detection engine (e.g., in an EDR) observes an anomaly, such as an unusual process attempting to access another process's memory or an office application launching PowerShell, AI flags it as a potential incident indicator. This is crucial for detecting advanced threats, including fileless attacks.
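
As a deliberately simplified illustration of that behavioral rule (real EDR engines learn such baselines statistically; the event layout below is an assumption of mine, not any vendor's schema):

```python
# Process names an abused Office macro chain would involve; the event
# dictionary layout is hypothetical.
OFFICE_APPS = {"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"}
SHELLS = {"powershell.exe", "pwsh.exe", "cmd.exe"}

def is_suspicious(event: dict) -> bool:
    """Flag an office application spawning a shell, per the rule above."""
    parent = event.get("parent_image", "").lower().rsplit("\\", 1)[-1]
    child = event.get("image", "").lower().rsplit("\\", 1)[-1]
    return parent in OFFICE_APPS and child in SHELLS

event = {
    "parent_image": r"C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE",
    "image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
}
print(is_suspicious(event))  # True -> raise an alert for an analyst to triage
```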

New generations of analytic tools go a step further, significantly improving my work. Instead of writing complex queries in languages like KQL or PowerQuery to search massive telemetry databases, I can now ask questions in natural language. For example: “Show me all failed login attempts to administrative accounts from outside Poland in the last 48 hours.” AI translates this into a precise query and returns results immediately. This is not only a time saver, but also a major aid when we start operating a new EDR or NDR system.
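
I cannot reproduce a vendor's generated query here, but the logic such a question encodes boils down to something like the following pandas sketch over an exported sign-in log; the column names, UTC timestamps, and the "admin-" naming convention are assumptions for illustration:

```python
import pandas as pd

# Sign-in telemetry exported to CSV; timestamps assumed ISO-8601 UTC.
signins = pd.read_csv("signin_logs.csv", parse_dates=["timestamp"])

cutoff = pd.Timestamp.now(tz="UTC") - pd.Timedelta(hours=48)
failed_admin_abroad = signins[
    (signins["timestamp"] >= cutoff)              # last 48 hours
    & (signins["result"] != "success")            # failed attempts only
    & (signins["user"].str.startswith("admin-"))  # naive admin convention
    & (signins["country"] != "PL")                # outside Poland
]
print(failed_admin_abroad)
```

The point is not the syntax: it is that the generated query must encode exactly these conditions, and I still need to be able to read it and confirm that it does.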

Moreover, when I get results as a huge table with hundreds of rows, I can give AI a follow-up task: “Summarize this data for me, identify the most frequent IP addresses, usernames, and unusual activity times.” In response, I get a concise, readable text report that immediately highlights the key points for deeper investigation. It helps me grasp the essence of the problem quickly.
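
Continuing the sketch above, that summarization step is, in essence, a handful of aggregations (source_ip is another assumed column name):

```python
# Top talkers and off-hours activity from the filtered frame above.
top_ips = failed_admin_abroad["source_ip"].value_counts().head(5)
top_users = failed_admin_abroad["user"].value_counts().head(5)
off_hours = failed_admin_abroad[
    ~failed_admin_abroad["timestamp"].dt.hour.between(7, 19)
]

print("Most frequent source IPs:")
print(top_ips)
print("Most frequent accounts:")
print(top_users)
print(f"{len(off_hours)} attempts outside assumed working hours (07-19 UTC)")
```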

The human factor above the madness

To me, the “madness” is believing that AI will soon replace cybersecurity experts. The real value of AI I see is not in its autonomy, but in its ability to augment human capabilities. It is a powerful tool that, in experienced hands, can automate tedious tasks, process huge volumes of data, and… generate text.

The final decision, contextual interpretation, intuition built on experience, and critical thinking remain human domains. Critical reasoning is essential to manage AI’s “madness”: verify outputs, calibrate models, distinguish false alarms from real threats, and—most importantly—keep learning and improving. I believe the future of cybersecurity is not “AI versus humans,” but treating AI as an additional tool in our arsenal.


Text author:
Kuba Pęksyk, Deputy SOC360 Director, 4Prime Group
He began his career as an analyst in SOC360, where he gained hands-on experience in cybersecurity incident analysis. He actively supports multiple organizations in responding to complex security incidents. He is responsible for coordinating the activities of all SOC360 teams and managing cooperation with partners. He ensures alignment between operational execution and business objectives.
