New Scholar Series · 28 April 2026 · Working with AI Research Agents to Engage with Academic Literature

Thinking
through AI

Literature review as scholarly practice.
Imperial College Business School
Xule Lin
Research Associate
Imperial College London
Incoming Assistant Professor
SKEMA Business School, Paris
Two ways to get this wrong

Mode A · Abdication

Let AI think for you.
Accept the outputs uncritically.
Efficient — and epistemically hollow.


Mode B · Abnegation

Refuse to engage.
Wait it out.
Safe — and increasingly untenable.

There is a path between.


Three orientations

Three principles

01

Thinking through.

AI is a medium for developing understanding — not a machine that automates it for you.

02

Engagement design.

The skill is in how you structure the interaction — not which button you press.

03

The inward lens.

When the output disappoints, examine your input first.

We'll come back to them.


What we'll do
The starting question

When management researchers run audits, evaluations, or experiments on LLMs — what methods do they use, and what kinds of claims do they make?

We'll use AI to interrogate how our field engages with AI.
The reflexive move is the point.

01
Conversational AI
Claude · Kimi · ChatGPT
02
Deep research
Claude · ChatGPT · Kimi Agent Swarm
03
Agentic workflow
Claude Code
01
Configuration one

The Socratic
conversation

Thinking with AI as a colleague.

Claude · Kimi · ChatGPT
Move 01 · Articulate before you tool
You → Claude

"When management researchers run audits, evaluations, or experiments on LLMs, what methods do they use and what kinds of claims do they make based on those methods?"

Principle 01
Thinking through

The act of writing this prompt forced me to know what I was actually looking for.
The clarity achieved through the prompt is the insight.

Move 02 · The first answer is rarely the answer
Claude → You

"Here are five dominant approaches: behavioral audits, decision-support deployment studies, ethnographic accounts, survey-based attitudes, and case studies of adoption…"

Competent.
Useful.
Don't accept it, yet.

Move 03a · The meta turn

Step back.
Read what just happened.

Main thread

Q1 → five canonical methods. Competent. Flat.

← where we just were
New session · You → Claude

"Read this conversation. Why isn't it getting at what I'm actually trying to understand?"

↓ paste the transcript
Principle 03
The inward lens

Don't argue with the answer.
Bring the transcript into a new conversation, and ask the conversation what your prompt was carrying.

Move 03a · What the new session surfaced

What my prompt
didn't carry.

What the prompt asked

Methods researchers use on LLMs — audits, evaluations, experiments.
A literature about using LLMs in research.

What I actually wanted

How researchers treat LLMs as theoretical objects — evaluative cognition, compliance modes, fairness reasoning.
A literature about what LLMs are like.

The gap

Studies make claims about what LLMs are like — but the methods often can't license those claims.
The construct-method matching gap.

The conversation didn't fail. The frame did.

Move 03b · Sharpen the frame
You → Claude

"When management researchers treat LLMs as theoretical objects of study — making claims about what LLMs are like as evaluators, decision-makers, reasoners, or cognitive entities — what evidence do they use to support those claims?"

Principle 03
The inward lens

Resist the instinct to blame the model (e.g., not smart, biased).
Observe whether the output mirrored the input.
Notice what the frame was.

Configuration 01 · Claude
Demo
Claude · Configuration 01 conversation
Configuration 01 · ChatGPT
Demo
ChatGPT · Configuration 01 conversation
Configuration 01 · Kimi
Demo
Kimi · Configuration 01 conversation
Configuration 01 · synthesis across models

Three reads. One pattern.
Still not done.

What the agents gave us
  • Four research postures (evaluators, decision-makers, silicon samples, psychological entities)
  • Behavioral evidence dominates — outputs matching outputs
  • Critical countercurrent surfacing (Lin 2025; dual-validity; Lindebaum & Fleming)
What we still need to push on
  • Why do researchers settle on one account of LLM internals over others?
  • What work is borrowed theoretical vocabulary doing?
  • The map is here. The mechanism isn't. → Configuration 02
Configuration 01 · Takeaway

The transcript is
the work.

What's visible
  • Articulation of what you actually want.
  • Conversational steering, diagnostic turns, reframing.
  • Reflexive reversal: input before model.
Configuration 01 · Takeaway

And what stays hidden.

What's harder
  • Verifying the underlying claims.
    Requires going back to the actual paper.
    The conversation surfaces names; verification is still your reading.
  • Systematic coverage of a literature.
    Mapping with criteria: journals, keywords, inclusion rules.
    The same discipline scholars have always used; conversational AI doesn't replace it.
  • Cross-session continuity.
    Solved by collaborative memoing at the end of a session.
    A standalone artifact that can re-prime any model, any chat.
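
Such a memo can be an ordinary markdown file; a minimal sketch (the headings are illustrative, not a prescribed format):

```markdown
# Session memo · LLMs as theoretical objects

## Where the question stands now
One paragraph, in your own words, of the current framing.

## What this session changed
The reframings, objections, and leads worth keeping.

## Open threads for the next session
What to push on, and which papers still need your own reading.
```

Pasted at the top of a fresh chat, it carries the framing without carrying the whole transcript.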
02
Configuration two

The orchestrated
search

Let agents discover. Then interrogate.

Claude DR · ChatGPT DR · Kimi Agent Swarm
Configuration 02 · part A · design the query
You → Deep Research

"Survey how management researchers treat LLMs as cognitive, evaluative, or psychological entities (2022–2026) — beyond using LLMs as tools."

01 · Theoretical claims

What's being claimed about LLM internals — cognitive, evaluative, motivational, architectural?

02 · Behavioral evidence

Output patterns, human comparisons, behavioral paradigms, psychometric administration.

03 · Inferential moves

Where vocabulary borrows from psychology, behavioral economics, organizational theory.

04 · Countercurrents

Lin's Six Fallacies, dual-validity, Lindebaum & Fleming — what these critiques target.

+ Compare across fields

AI alignment, interpretability, computational social science, HCI — where do management's standards diverge?

+ Surface the mechanism

Where the same evidence supports multiple accounts. What work imported vocabulary does that the evidence can't.
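
One way the components above might be assembled into a single deep-research query (wording illustrative, not the workshop's exact prompt):

```markdown
Survey how management researchers treat LLMs as cognitive, evaluative,
or psychological entities (2022–2026), beyond using LLMs as tools.

For each paper, extract:
1. Theoretical claims: what is claimed about LLM internals?
2. Behavioral evidence: output patterns, human comparisons,
   behavioral paradigms, psychometric administration.
3. Inferential moves: where vocabulary borrows from psychology,
   behavioral economics, or organizational theory.
4. Countercurrents: what the methodological critiques target.

Then compare evidential standards with AI alignment, interpretability,
computational social science, and HCI; flag cases where the same
evidence supports multiple accounts of LLM internals.
```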

Principle 02
Engagement design

A deep-research query is a contract.
Be precise about scope, structure, and what you want surfaced — let the agent split the work.

Move 01 · continued

A query is a
methodological
decision

Configuration 02 · ChatGPT · Deep Research
Demo
ChatGPT · Deep Research run
Configuration 02 · Claude · Deep Research
Demo
Claude · Deep Research run
Configuration 02 · part B · adapt to the agent
You → Kimi Swarm

More to do.
More to ask.

Same question

"Survey how management researchers treat LLMs as cognitive, evaluative, or psychological entities (2022–2026) — beyond using LLMs as tools…"

← from Configuration 02 · deep research
+ Latitude

Design the workflow yourself — phases, sub-agents, parallelize where useful.
Redirect if a stream is thinner or richer than expected.
Document what you did and what you skipped.

Different deliverable

A folder, not a chat.

  • /papers — PDFs, Zotero-ready filenames.
  • .bib + .ris — bibliography.
  • stream_[N]_*.md — per-stream summaries.
  • synthesis.md — cross-stream, the analytical core.
  • phase_[N]_notes.md — phase-by-phase reasoning.
  • README.md — decisions, skips, caveats.

Quality bar — foundation for a published lit review. Inferential clarity over coverage breadth.
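
"Zotero-ready filenames" can follow any convention; a minimal sketch of one common Author_Year_Title scheme (the pattern is an assumption here, adjust it to your own vault):

```python
import re

def zotero_filename(author: str, year: int, title: str, ext: str = "pdf") -> str:
    """Build a filename like Lin_2025_Six_Fallacies.pdf.

    Author_Year_Title is one common convention, not the only one.
    """
    # Keep word characters; collapse everything else to underscores.
    slug = re.sub(r"\W+", "_", title).strip("_")
    return f"{author}_{year}_{slug}.{ext}"
```

For example, `zotero_filename("Lin", 2025, "Six Fallacies in LLM Research")` yields `Lin_2025_Six_Fallacies_in_LLM_Research.pdf`.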

Principle 02
Engagement design

Be precise about what artifacts you expect — that is our half of the contract.
Be charitable about how the agent orchestrates — it knows its own runtime; let it split the work.
Stay clear on the analytical stages and outputs you want surfaced along the way.

Configuration 02 · part B · orchestration in flight
Demo
Kimi Agent Swarm · orchestration in flight
Move 02 · the indictment

"Most people stop here.
They cite the report and move on.
That's abdication
wearing a lab coat.
"

Move 02 · the credit

The report is raw material.

What it got right
  • Broad coverage across journals and years.
  • Surfaced papers I wouldn't have found on my own.
  • Clean structuring of method families.
Move 02 · the bridge

Take it somewhere new.

Two paths
  • New session, same model.
    Open a fresh session. Hand the report in — not as answer, as evidence.
    The new session has no allegiance to its first take.
  • Cross-agent.
    Feed Kimi's report into Claude. Feed ChatGPT's into Kimi.
    Each model's frame surfaces the others' assumptions.

We'll walk path one in a moment. Path two is yours to take home.

Move 03 · Architect the sessions
New session · reports attached

"Here are deep research reports on how management research treats LLMs as cognitive entities.
Read them critically.
What assumptions do they carry?
What would a qualitative researcher notice that's missing?
What would a critical theorist push back on?"

A new session,
a different epistemic position.

Move 03 · continued

Two modes.

Mode 01

Exploration.

Open-ended. Breadth-first.
The agent is your thinking partner.

"Let's read across the report. Where is it most coherent? Where does it strain?"

Principle 02
Engagement design

First, the open read.
The agent is your thinking partner — not your interrogator.


Mode 02

Critical.

Step out of the field.
With another field's eyes — charitably both ways.

"Let's read this with AI-research eyes. Where does management's vocabulary diverge from how AI labs frame the same phenomena — and where would a management researcher reasonably push back?"

Not "you are X." Substantive perspective-taking — both sides have legitimate ground.

The difference is what you can't get from either alone.

Principle 02
Engagement design

Then, the critical read.
Collaborative framing — "let's read this as Y" — keeps the agent honest and keeps you in the loop.

Configuration 02 · Takeaway

Discovery and
interrogation, as a designed sequence.

What's visible
  • Engagement design across multiple sessions.
  • Productive suspicion of any single output.
  • A workflow with phases, each with its own purpose.
Configuration 02 · Takeaway

And what stays opaque.

What's harder
  • Breadth comes at the cost of depth.
    The agent often works from titles, abstracts, and secondary write-ups.
    When PDFs are accessible, it reads them, but you have to notice which is which.
  • A gap between your query and the report.
    You can only channel so much understanding into a query.
    Reading the report against your question is where the missing pieces show up.
  • It won't carry your scholarly voice.
    The output is competent prose, not your argument.
    Curating insight through your judgment is still the work.
03
Configuration three

The full
bandwidth

When you can read along.

Claude Code · Codex CLI · Gemini CLI · Kimi CLI
Move 01 · Watch the reasoning trace

Same question, visible process.

▸ claude ~/research/ai-in-mgmt
Reading papers/llm-as-objects/*.md (11 files)
Extracting claim-evidence pairs from each abstract …
Cross-referencing with papers/critical/  (7 files)
Pattern emerging: 8/11 management studies make claims about LLM internal mechanisms from output patterns alone. AI-side work reaches for interpretability or intervention designs to back the same claim.
Drafting comparison table → memo-v1.md
Move 02 · Steer in real time

All three principles, alive at once.

… continuing from previous turn …
Prioritizing the behavioral-audit anchor as the synthesis spine.
[ stop ]   Why that paper over the critical-methods counter? You're reproducing the dominant framing. What would the margins of this literature look like?
Re-anchoring on the critical-methods counter. Re-running comparison with critical-management methods first …
01 · Thinking through 02 · Engagement design 03 · Inward lens
Configuration 03 · the ecosystem

The agent has a workshop.

Demo
The agent reads the same files you preview.
Agentic CLI
terminal · choose your agent
Claude Code
Anthropic
Codex CLI
OpenAI
Gemini CLI
Google
Kimi CLI
Moonshot
~/research/llm-as-objects $
claude --dir ./vault
Reading 28 markdown notes …
Building synthesis from 12 files matching methods/.
Same workflow. Different brain on the other end of the prompt.
Obsidian vault
all markdown, all readable by you AND the agent
📁 vault/
📁 papers/  — PDFs + .md conversions
📄 behavioral-audit-2024.pdf
📄 behavioral-audit-2024.md  ← deep-readable
📄 critical-methods-counter.md
📁 notes/
📄 methodological-landscape.md
📄 assumptions-baked-in.md
📄 cross-disciplinary-leads.md
📁 prompts/  — reusable engagement designs
📄 socratic-pushback.md
📄 fresh-session-interrogation.md
📁 sessions/  — transcripts as artifacts
Preview anywhere
Markdown renders in Obsidian. The agent reads the same file. No format mismatch.
PDF → markdown
Use AI to convert PDFs to markdown. Now you can read them deeply, annotate, and let the agent work against them.
Sessions are files
Drop transcripts in. Reuse prompts. The practice persists across model choices and interfaces.
Configuration 03 · Takeaway

The reasoning trace
becomes an artifact.

What's visible
  • Every choice the agent surfaced — and why.
  • Mid-process intervention as a first-class move.
  • All three principles, simultaneously.
Configuration 03 · Takeaway

And what it costs.

What's harder
  • Setup cost (files, tools, environment), which pays off in the long term.
  • More surface area means more places to drift.
  • More bandwidth ≠ better work. The principles still gate.
Three configurations · one practice

Not levels.
Bandwidths.

Configuration 01
Conversational
visible in conversation
The narrowest bandwidth, but everything important is on the page.

Configuration 02
Deep research
visible in interrogation
Discovery you can't do alone, but the search itself stays opaque.

Configuration 03
Agentic
visible in real-time steering

Same principles. Same practice.

04
What we did

Synthesis

The question sharpened — and the sharpening is the finding.

The question, before and after
Where we started

When management researchers run audits, evaluations, or experiments on LLMs, what methods do they use and what kinds of claims do they make based on those methods?

A reasonable opening, but soft. Look at what it becomes.


Where we ended

When management researchers treat LLMs as cognitive entities — evaluators, decision-makers, reasoners — what evidence licenses those claims, and what kinds of claims can that evidence actually support?

The question you end with matters more than the question you start with.

Principles, revisited

From abstractions to craft.

01
Thinking through.

Remember when I rewrote the prompt because my first version was too generic.

02
Engagement design.

Remember when I took the deep-research report to a fresh session for interrogation.

03
The inward lens.

Remember when the output went flat and I looked at my input instead of the model.

"The purpose of this talk is not to teach you the content. It's to change how you think about the process."


The framework behind the workshop

Interpretive
Orchestration

Just published

Lin, X. & Corley, K. G. (2026). Interpretive Orchestration. Strategic Organization.

Three models. Each maps to one configuration you just saw.

The framework · 3 models, 3 configurations

One practice,
three strategic models.

From Lin & Corley, Interpretive Orchestration. The models are the practice; the configurations are how it shows up at different bandwidths.

01 · Socratic tension
Deliberate contradiction to surface assumptions.
→ Configuration 01 · Conversational diagnosis

02 · Euclidean documentation
Systematic context-building for reproducibility.
→ Configuration 02 · Deep-research orchestration

03 · Vitruvian mastery
Reading across independent analytical passes.
→ Configuration 03 · Agentic, full-bandwidth
Take this with you

Resources.

Start here · no setup

  • Claude  claude.ai
  • Kimi  kimi.ai
  • ChatGPT  chatgpt.com
  • Gemini  gemini.google.com

Go deeper

  • Claude Code + Obsidian setup
  • Interpretive Orchestration paper
    papers.ssrn.com/abstract_id=6629679
  • Markdown source on GitHub
    github.com/linxule/interpretive-orchestration-paper

Supplementary readings

  • Epistemic Voids #1 — Citation Theater  threadcounts.org
  • Research with AI #1 — The Foreclosure Problem  threadcounts.org
  • LOOM XVI — Are You Climbing the Right Hill?  threadcounts.org
  • Alvesson & Sandberg (2011), AMR 36(2), 247–271
Closing

The question you end with matters more than the question you start with.

AI-augmented methods accelerate that movement and make it more visible.

They demand deeper and more rigorous engagements from us.

Xule Lin
xule.lin@imperial.ac.uk
linxule.com  ·  threadcounts.org  ·  research-memex.org
Imperial College Business School