Latent Space
December 31, 2025

[State of AI Papers 2025] Fixing Research with Social Signals, OCR & Implementation — Team AlphaXiv

State of AI Papers 2025

By Latent Space

Quick Insight: This summary is for builders tired of the arXiv firehose who need to separate signal from academic noise. It explains how AlphaXiv turns static PDFs into living codebases using social signals and AI agents.

  • 💡 Why are papers becoming "puff pieces" for code?
  • 💡 Which open-source models are winning the RL fine-tuning race?
  • 💡 How can social signals fix the broken peer-review system?

Podcast Link: Click here to listen

Top 3 Ideas

🏗️ The Implementation Pivot

"At the end of the day, papers are a puff piece for the implementation."
  • Code Over Prose: Researchers care about what works in production rather than theoretical benchmarks. This reality forces authors to prioritize reproducible GitHub repos over flashy PDF charts.
  • The Docker Frontier: AlphaXiv plans to maintain Docker images for popular papers to enable one-click experimentation. This removes the dependency hell that usually kills research adoption.
  • Incentivizing Ease: Ranking papers by implementation ease creates a new competitive metric for academics. Authors will write cleaner code to climb the AlphaXiv discovery feed.

🏗️ Social Signal vs Semantic Noise

"If you just try to build a semantic assistant, things get noisy fast."
  • Contextual Ranking: Pure vector search fails because researchers use the same buzzwords for low-quality work. AlphaXiv uses view counts and Twitter engagement to surface real breakthroughs.
  • The YouTube Filter: Integrating signals from creators like Emergent Mind adds a layer of human curation to the algorithm. This turns a cold index into a warm recommendation engine.

🏗️ The RL Agent Revolution

"Qwen was specifically designed to be very sample efficient when it comes to RL fine-tuning."
  • Qwen Dominance: Startups are flocking to Qwen 2.5 for agentic tasks due to its efficiency. This makes it the base layer for the next generation of reasoning models.
  • Automated Science: Agents like Agent Laboratory are beginning to automate literature reviews and experiment design. While end-to-end automation is distant, the virtual lab assistant is already here.

Actionable Takeaways

  • 🌐 The Macro Trend: Academic research is transitioning from a "publish or perish" PDF culture to an "implement or ignore" code culture.
  • ⚡ The Tactical Edge: Use AlphaXiv to filter research by social signal and implementation ease rather than just keyword relevance.
  • 🎯 The Bottom Line: The PDF is an antiquated artifact. In 2025, the value of a paper is measured by the speed at which a developer can spin up its Docker container.

Hi guys, welcome to Latent Space. Thanks for having us. You guys are the founders of AlphaXiv, which I've confirmed is the official pronunciation. What's each of your origin stories of coming together and starting this?

Sure, yeah. So, I'm Raj. All of us were at Stanford before. I was roommates with Rehaan actually, and still am, for the last five years. We started off doing AI research like a lot of others at Stanford. I was in Dorsa's lab, also working with Percy a little bit, at the intersection of robotics and NLP. Honestly, the real origin story is that one day Rehaan and I were in our operating systems class, really late, at like 1:00 a.m., and Rehaan goes, "Hey, do you want to see my final web dev class project?" And I'm like, "What makes you think I want to see that right now? We're trying to get this pset in and go home." And he was like, "No, you should look at it. It's pretty cool." And I was like, "Okay, sure, show it." It was basically a really janky view/comment button next to every paragraph of an arXiv paper, and the idea was, hey, you could comment on papers. And I was like, "This is kind of cool. We should just work on it. It's kind of strange that there aren't comments on arXiv papers, given how many people are reading AI papers and how fast that's growing."

So we kind of just started to work on it for fun. We showed it to our lab and they were like, this is pretty cool, they'd use it if it takes off. But I think it really became serious when it went viral: one of our friends, who was at FAIR at the time, posted about it on LinkedIn and it blew up, and from there we started working with Sebastian, who was our adviser. Yeah, it's been pretty cool.

Yeah. My name's Rehaan. Before working on AlphaXiv, as an undergrad I was doing research in robotics and RL. A lot of this was still before GPT, so the thinking was that as a meager undergrad, your PhD mentor gives you papers and you struggle to understand them, and surely there are other undergrads out there with the same questions. There should be some Stack Overflow analogue for arXiv papers. Obviously since then it's taken many forms, but that was the initial iteration.

Like Raj and Rehaan, I also did research in undergrad, sparse deep learning research with a group at Stanford, and we all had this shared experience as undergrad researchers. It seems very simple, just the simple premise of a comment section for arXiv papers. It was a project that we worked on 20, 30 hours a week, whenever we had free time, and obviously it just launched from that, and it's been a really exciting journey.

I guess the reason that when I first saw AlphaXiv I didn't necessarily think about it as new was because I knew Hugging Face had also launched a paper discussion thing. And Hugging Face is huge, so why did you seem to win, or break out, versus them?

Yeah. I think there were a few things. Even when we were just comments, I think the core of it was that we were our product's ICP. We knew Hugging Face Papers existed; we felt like some of those problems were unsolved, something as simple as being able to comment directly on the paper, the user interface. When we started we were also very deliberate about getting authors, from LoRA, DPO, Llama, and actually having really useful, cool exchanges, and then those would get shared.

Did you reach out to them and say, hey, please comment here?

Yeah, we would use people we knew from research, roughly. But from there it would grow really quickly, and from comments there's been a really natural progression that we've taken that other paper platforms haven't. So when commenting took off, we were like, okay, we have a good idea of what papers people are reading, right? If you're trying to discover papers, on one end of the spectrum is arXiv sorted by new, a thousand papers every day, and on the other end of the spectrum is Twitter. So, okay, this is actually a good chance to put together a higher-signal feed of papers based on that interaction. So commenting became a feed of papers, and then a lot of the questions people were asking could probably be answered by AI. Gemini came out with the 1 million token context: okay, let's throw that on here. So we've progressed and built a more cohesive experience, just growing our user base and pulling that thread.

Just on the Gemini note, do you parse the PDF to text and put that in context, or just throw the images in?

So it depends on the model; we support a bunch of different models. Some are just text-based. With models like Claude, we try to pass in the relevant diagrams as well.
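As a rough illustration of that split, here is a minimal sketch using PyMuPDF to pull out both the plain text and rendered page images from a paper PDF, so a caller can decide what to send depending on whether the target model accepts images. The file name and the send_to_model call are placeholders, not anything from the AlphaXiv stack.

```python
# Sketch: extract text and page images from an arXiv PDF, then choose what to
# send depending on whether the target model is text-only or multimodal.
# Requires PyMuPDF (pip install pymupdf); send_to_model() is a placeholder.
import fitz  # PyMuPDF

def prepare_paper(pdf_path: str, multimodal: bool):
    doc = fitz.open(pdf_path)
    text = "\n".join(page.get_text() for page in doc)
    images = []
    if multimodal:
        # Render each page to PNG so figures and tables survive; a real system
        # would likely crop just the relevant diagrams instead.
        images = [page.get_pixmap(dpi=150).tobytes("png") for page in doc]
    return text, images

text, images = prepare_paper("2501.00000.pdf", multimodal=True)  # hypothetical file
# send_to_model(text, images)  # stand-in for the actual Claude/Gemini call
```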

So what's the best PDF parsing model?

Ooh, that's a good question. In the last month there's been a flurry of different OCR models that have come out, starting with...

Well, yeah, DeepSeek. How's that?

So I think in terms of cost and accuracy, DeepSeek is pretty good. If you host it on your own A100s and batch things properly, DeepSeek is probably the best bang for your buck. There are services like Mistral that have their own OCR APIs; those might be a little more expensive if you're using their API offering, but I think DeepSeek is very, very good.

Any Moondream?

Oof, I'll probably add that later. But the OCR example is really funny. It's one of the cool things we see on AlphaXiv: on day one someone will release an OCR model, and then over the next two or three days four other people, who've probably been working on this for months, will put out their own OCR models. So it's kind of cool to see the value that AlphaXiv brings there. Yeah. Cool.

So what's the progress of the company? You guys decided it was a side project; when did you decide to take it to the next step, and where are you at now?

Yeah, for sure. For the first year or two of working on it, it was a project, and what really forced that transition for us was that we were looking at Papers with Code, Weights & Biases, Hugging Face, and we're like, okay, we have a user base that really loves what we're doing, and obviously research extends so much further beyond papers; we want to do things with the other artifacts of research. One of the things we're going to be working on is making it really easy for people to play around with implementations of papers directly on the site.

Are you a full replacement for Papers with Code already, or...

I would say so, yeah, because we have benchmarks and a lot of the state-of-the-art pages. I know a lot of people used it. Yeah, it was taken down. So we've added the feed, the benchmarks, whatever. But I think the thing that drove us to become a company was that there were so many different artifacts of research beyond papers, and it just made sense to go from there. I don't know if the others want to chime in.

Yeah. The analogy is that arXiv existed and we came in and built an intelligent layer over arXiv: a much nicer interface, with tools to quickly understand the core ideas of a paper, with comments. If we can do the same thing for benchmarks, right? You mentioned Papers with Code used to be good; it was still very community-driven, people would upload their own benchmarks or implementations or whatnot. Now with LLMs it's really easy: we can parse, using OCR, the charts and tables, figure out what the leading models, papers, and ideas are on each benchmark, and bring them all into one place. One way to think of it is just helping people make sense of the firehose of AI research, and that's not just papers: benchmarks, models, implementations.

Yeah, the thing I really want, which is easy to do now, is a monitor: when this topic comes up, let me know. So, a custom feed of stuff.

Yeah, that's also where we want to put a lot of work into our assistant: have the best research assistant that knows your interests and can give you the most relevant and up-to-date research. So it's really a personalization and ranking system, a recommendation system for papers.
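That kind of topic monitor can be prototyped very simply. Below is a minimal sketch assuming sentence-transformers for embeddings; the model name, threshold, and the idea of a daily abstract pull are illustrative assumptions, not how AlphaXiv's assistant actually works.

```python
# Sketch: alert when newly indexed papers match a user-defined topic.
# Assumes sentence-transformers; model choice and threshold are arbitrary.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
topic = "evolution strategies for fine-tuning large language models"
topic_vec = model.encode(topic, convert_to_tensor=True)

def check_new_papers(abstracts: list[str], threshold: float = 0.45):
    """Return abstracts whose embedding similarity to the topic exceeds the threshold."""
    vecs = model.encode(abstracts, convert_to_tensor=True)
    scores = util.cos_sim(topic_vec, vecs)[0]
    return [(a, float(s)) for a, s in zip(abstracts, scores) if float(s) > threshold]

# matches = check_new_papers(todays_abstracts)  # hook this into a daily arXiv pull
```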

Anything else that is underappreciated? Do you see yourself as a social network, a recsys?

Goodreads? No. We've gone through multiple iterations, and I think where we provide the most value is, concretely, not as a social network but as a tool over research. Maybe there's some type of social signal that comes out of that, but concretely people use this as a tool to understand research. And one thing that's underappreciated is that papers are the tip of the iceberg of research, and we're very well positioned to help people beyond just the workflow of reading papers.

So you could imagine, one thing we're really interested in, it's not publicly released yet, but we want to start maintaining Docker images for papers and make it really easy for people to spin up implementations directly from their browser. A lot of the time when people are reading papers, why are they reading them? Our audience is a lot of applied researchers. They're figuring out what research is relevant for them, and to get that you need more than what a paper can tell you. At the end of the day, papers are great, but they're a puff piece for the implementation. That's what they are. So getting closer to the signal of the research means you need to let people play around with the implementation.
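The feature isn't public, so the details are unknown, but the general shape of a one-click paper environment might look something like this: a prebuilt image per paper that pins its dependencies, launched from a thin wrapper. The image naming scheme, entrypoint, and arXiv ID below are all hypothetical.

```python
# Sketch: what "spin up a paper's implementation" might reduce to on the backend:
# pull a prebuilt image for the paper and run its demo script in a container.
# The image name and entrypoint are hypothetical placeholders.
import subprocess

def run_paper_demo(arxiv_id: str, gpu: bool = True):
    image = f"papers/{arxiv_id}:latest"     # hypothetical per-paper image
    cmd = ["docker", "run", "--rm"]
    if gpu:
        cmd += ["--gpus", "all"]            # requires the NVIDIA container toolkit
    cmd += [image, "python", "demo.py"]     # hypothetical entrypoint inside the image
    subprocess.run(cmd, check=True)

# run_paper_demo("2510.01234")  # placeholder arXiv ID
```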

So I think one thing that's underappreciated is how much potential there is to do things beyond papers in this realm of making tools for researchers. Even with the current state of the site, I would say that if you want a bird's-eye view of the state of AI research, of the state of AI in general, AlphaXiv is the best place to get it. If you look at Twitter, there's a lot of stuff there, but it's very noisy. If you look at Hugging Face, sure, there are the models and datasets, there are certain pieces, but it's not really a cohesive, holistic experience.

And something that we've done, obviously we started as a comment section, but we've created a platform where anyone, from ML researchers to VCs who just want to stay up to date with what's going on in AI, can come, and I think AlphaXiv has become that place. And, like Rehaan alluded to, we do want to go deeper. We don't want AlphaXiv to just be a place where you look at things and survey things. We want it to also be a place where eventually you are doing your own experimentation, actually working with implementations of papers, and I think we're just scratching the surface of what's possible here.

So, we're here at NeurIPS. You've been here the longest, and I asked you guys to maybe talk about papers of the year, papers at NeurIPS, your favorite paper that you can't shut up about. Obviously, working at AlphaXiv, you have to love papers.

Yeah, go for it. Go ahead. Okay, sure. So, one category of papers, and I'll preface this with the quote we saw on Twitter from Ilya, that the age of scaling is over. I feel like we see some embodiment of that on AlphaXiv, where the people posting open research may not always have thousands of GPUs, and so you're forced to come up with really creative solutions to problems under limited compute. Whether these translate to practical applications or not is maybe a separate question, but I always find these types of papers very interesting.

So one paper that for the last month I can't shut up about is Tiny Recursive Models, TRM, a simplification of HRM, which is biologically inspired. TRM cuts out all the biological inspiration and says, hey, let's take a 7-million-parameter transformer, have it pass its latent vector and its output back into itself recursively, and use that, a 7-million-parameter model, to get really good results, relatively speaking, on very specific but challenging puzzle tasks: Sudoku, ARC-AGI. I believe it gets around 45 to 50% on ARC-AGI-1. That's not state-of-the-art, but compare it to DeepSeek R1 or Gemini 2.5 and it's pretty good. So I think these types of creative solutions are really fun. They're well received on AlphaXiv whenever these kinds of papers come out, and just as a nerd for papers I find that really cool.
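A rough PyTorch sketch of the recursive idea described above, not the actual TRM architecture or code: a single tiny network is applied repeatedly to refine a latent state and an answer embedding, so effective depth comes from iteration rather than parameter count. All dimensions and step counts here are made up for illustration.

```python
# Sketch of a tiny recursive model: one small network applied T times to refine
# a latent z and an answer embedding y. Dimensions and step counts are made up.
import torch
import torch.nn as nn

class TinyRecursiveStep(nn.Module):
    def __init__(self, d: int = 256):
        super().__init__()
        self.update_z = nn.Sequential(nn.Linear(3 * d, d), nn.GELU(), nn.Linear(d, d))
        self.update_y = nn.Sequential(nn.Linear(2 * d, d), nn.GELU(), nn.Linear(d, d))

    def forward(self, x, y, z):
        # Refine the latent from the input, current answer, and previous latent,
        # then refine the answer from the new latent.
        z = z + self.update_z(torch.cat([x, y, z], dim=-1))
        y = y + self.update_y(torch.cat([y, z], dim=-1))
        return y, z

step = TinyRecursiveStep()
x = torch.randn(8, 256)   # encoded puzzle input (batch of 8)
y = torch.zeros(8, 256)   # answer embedding, refined over iterations
z = torch.zeros(8, 256)   # latent "scratchpad"
for _ in range(16):       # recursion depth stands in for model size
    y, z = step(x, y, z)
```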

Another one is really recent, I believe from the last week or two: evolution strategies at hyperscale. Basically, cut out gradient descent and use low-rank perturbations at the million-plus parameter scale and get really good results. So, use search rather than gradient descent, just random perturbations. You can't do this over a million parameters at once, right, that doesn't make sense, but you can make sets of low-rank perturbations and apply those, so it's evolutionary, and you get pretty good results compared to, I believe, GRPO on some tasks. It was trending on AlphaXiv and I was like, this works? [laughter] Again, whether it scales to anything, I don't know, but it's cool to see this kind of work.

Love it. Yeah, great recs. Let's keep going around.
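To make the idea concrete, here is a minimal NumPy sketch of evolution strategies with low-rank perturbations instead of gradients, loosely following the description above rather than the paper's actual method or hyperparameters: each candidate update is a cheap rank-r outer product, and the population's fitness scores weight the combined update.

```python
# Sketch: evolution strategies with low-rank perturbations instead of gradients.
# Each candidate perturbation is alpha * U @ V.T, cheap to sample and apply even
# when W is large; fitness-weighted perturbations form the update. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def es_step(W, fitness_fn, pop=32, rank=4, alpha=0.01, lr=0.5):
    m, n = W.shape
    deltas, scores = [], []
    for _ in range(pop):
        U = rng.standard_normal((m, rank))
        V = rng.standard_normal((n, rank))
        delta = alpha * (U @ V.T)          # low-rank perturbation of the weights
        deltas.append(delta)
        scores.append(fitness_fn(W + delta))
    scores = np.array(scores)
    weights = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalized fitness
    update = sum(w * d for w, d in zip(weights, deltas)) / pop
    return W + lr * update

# Toy fitness: how well W maps a fixed input to a fixed target (higher is better).
x, target = rng.standard_normal(64), rng.standard_normal(16)
W = np.zeros((16, 64))
fitness = lambda W_: -np.linalg.norm(W_ @ x - target)
for _ in range(200):
    W = es_step(W, fitness)
```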

I think one area that's also been really exciting, especially among our user base, and that I'm seeing here at NeurIPS as well, is AI for science.

Yeah, we're actually starting a dedicated AI for science pod; it's my second podcast.

That's awesome. Super exciting. We kind of have a big community around it as well, and actually we're doing an event on Saturday.

Oh yeah? Well, I'll bring my new host here.

Awesome, that'll be exciting. So we have one of the co-founders of Laya, James Zou from Stanford, and Jeff Clune as well. I was first really excited when I saw this paper earlier this year called Agent Laboratory, from Sam Schmidgall at DeepMind. The idea is: what if we basically automate the scientific process itself? It has different LLM agents for different parts of the scientific method: one for literature review, one for running experiments, and one for analyzing the results and writing the paper.

And it was pretty impressive. It's still going to take some time, and a lot of human-in-the-loop still makes a big impact, but they were able to show that for a few dollars you could go from idea to final report for certain topics, which is pretty interesting. And we're seeing a lot of companies forming in this area too, like Axiom recently. I think it's a really interesting space.

I have a problem, maybe others do, maybe others don't, with just lumping everything into "science."

Yeah, it's definitely pretty broad. Like you said, there's math, and then there's the life sciences, which feel a bit more difficult; they have almost nothing in common. Automating ML research seems a little easier, that's kind of my intuition, than building AI for, you know, life-science research.

Can we agree that that doesn't count as AI for science?

I think that's a fair take. Otherwise everything counts. And you can make an argument ML research isn't really science [laughter], it's more of an art. Okay, so, just generally, AI for science.

Any particular paper or discussion from the last couple of days that sticks in your head?

Yeah, I think, yeah, the Agent Laboratory paper.

Oh yeah, Agent Lab. Okay, so my comment on that is, every quarter there's somebody who says, "Oh yeah, we've built the AI scientist." And then I don't know if it goes anywhere, I don't know if it's a proof of concept. Like, what is going on?

It depends on the example. Going back to AI for life science, I'll make a distinction. We had a speaker who wrote this paper called Biomni, and it started as a paper; now it's got a lot of traction and he's continuing to develop it. I think that was the first time where, okay, it was conceptualized and people are actually using it as a tool. And obviously things are going to take time. Part of it is maybe that to get traction you say you're automating all of science, but in his case with Biomni, it seems there are people at pharmaceutical companies or bio organizations that are using it as a tool. To what capacity, I'm not a life science expert, but I think there are probably steps from it being a tool to automating a lot of things.

I'm skeptical of the automation of the whole process. I like the idea of thinking of it more as a virtual lab. James Zou frames these AI scientists as a virtual lab, where it's cheap to experiment and explore. Exactly, so the person is the PI, and I think that vision makes a lot of sense. I don't know how much of it is pitching versus science when someone says, "we're going to completely build a scientist," because one of the theses of AlphaXiv is that as we go on there will be more researchers, and maybe more tools to help them, but the actual process of doing research is human work. That is one of the final frontiers of human work: to be able to intuit and ideate and figure out what's next. So I see these as all tools.

Yeah, just extending what Rehaan said: maybe for pitching, or for what we see in the news, people say, "oh, we're building an AI scientist," or "an AI material scientist," but you don't actually need to do that whole end-to-end workflow to provide value. Like what Rehaan was mentioning about Biomni, I know they're providing a lot of value with tools that sort through the literature for you, find what's relevant, and then help you get an experiment set up. You're not automating the whole workflow, but it saves a lot of time for researchers or applied biotech people.

So yeah, I'm also aligned with you, a little skeptical about automating the whole end-to-end workflow, but I think there's value in the intermediate parts.

Yeah, skeptical that it's actually put into practice rather than just done for a paper, for marketing purposes. But we can move on to other papers.

Just touching on that, one last thing: a lot of the science that gets done doesn't sound very sexy. It's a lot of ablations, a lot of rote work that's manual, and it doesn't require ingenuity or human brilliance, but I think agents are very much capable of doing a lot of the things that a lot of scientific processes require. So I think there's tremendous value there. It's not quite what you're talking about, an AI scientist, but there's something there for sure.

Your paper.

Oh yeah. For me, something that has emerged in the last year or so is building agents with RL for long-horizon, complex tasks, and one specific paper I found interesting is Agent R1.

Agent R1? I think I missed that.

What they did is build an end-to-end RL framework for training agents on these complex reasoning tasks, and the specific task they chose was multi-hop question answering. So you have to train an agent to interact with a tool environment and respond to complex queries. And when you apply that framework on top of a base model like Qwen 2.5, you suddenly have an agent that is very, very good at these more complex tasks.
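Very roughly, and only as an illustration of the shape of such a setup rather than Agent R1's actual code, an RL loop for a tool-using QA agent looks like: roll out the policy, let it interleave search calls with a final answer, score the outcome, and normalize rewards across a group of rollouts for the update (GRPO-style). The policy, the search tool, and the reward below are all stand-ins.

```python
# Sketch of the rollout / reward bookkeeping for RL-training a tool-using QA agent.
# The policy and search tool are stand-ins; nothing here is Agent R1's actual code.

def search_tool(query: str) -> str:
    return f"[retrieved passages for: {query}]"      # stand-in for a real retriever

def run_episode(policy, question: str, answer: str, max_hops: int = 3) -> float:
    context = question
    for _ in range(max_hops):
        action = policy(context)                     # a search query or a final answer
        if action.startswith("ANSWER:"):
            return 1.0 if answer.lower() in action.lower() else 0.0
        context += "\n" + search_tool(action)        # append evidence, keep reasoning
    return 0.0                                       # ran out of hops without answering

def group_advantages(rewards: list[float]) -> list[float]:
    # GRPO-style: normalize rewards within the group of rollouts for one question.
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    if std == 0:
        std = 1.0
    return [(r - mean) / std for r in rewards]

# rollouts = [run_episode(policy, q, a) for _ in range(8)]  # one group per question
# advantages = group_advantages(rollouts)                   # feeds the policy update
```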

And the reason this is interesting to me is that you see a lot of parallels between this paper and what's actually happening in industry. One example is Cursor's Composer model: there are a lot of rumors online that it was built on top of Qwen 2.5, people saw Chinese reasoning traces and whatnot. We don't know exactly how they built it, but you see these techniques everywhere. It's a very popular approach to build on top of Qwen 2.5, and there's a whole crop of RL-as-a-service and fine-tuning-as-a-service companies helping enterprises build in-house agents. So a lot of what we see in industry these days is a reflection of the research being done on AI agents and RL. I think it's definitely very exciting.

And one thing to add there: frontier labs aside, there are these startups that have collectively raised hundreds of millions of dollars, and they're all fine-tuning Qwen, specifically Qwen, because, if you talk to some of these folks, Qwen was specifically designed to be very sample-efficient when it comes to RL fine-tuning. They designed for this when building the model, and it's cool to see all these different startups tracing right back to Qwen and the open-source landscape. And then you see papers like Agent R1...

But was that DeepSeek, literally R1?

I don't know why they threw R1 in the name, but they were fine-tuning the Qwen model, right? Okay, I was wondering why. Justin's here from the Qwen team, roaming around, and I think this is very much Qwen's year. In fact, comparatively, I think DeepSeek actually started the year super high and then fell off a little, which is kind of interesting.

I was just talking with one of the folks here about the presence of DeepSeek. As far as what we see on AlphaXiv, they still pull a lot of weight: when they put out work, people listen. And it is great work. An example was with OCR: they put out DeepSeek-OCR, and then the next day another OCR model comes out, the startup called Chandra puts out an OCR model.

But if I'm not mistaken, their numbers were still worse than DeepSeek's.

Yes, but I think it still goes back to the point that when DeepSeek publishes, other people very much pay attention. Whether or not, I don't know, it's hard to say a difference of one day was motivated by DeepSeek.

No, right, but I think they feel pressure. It's almost the parallel between research and product releases; you kind of see that here.

Yeah. I mean, look, a lot of interesting papers and coverage at NeurIPS; we just covered a best paper for DRL. Another thing I'm trying to explore is the health of the academic paper ecosystem, which obviously you guys care a lot about. My sense from talking with area chairs and program chairs is that the review process is not keeping up, and therefore conference submissions and quality are just going down. ICLR had a big scandal recently with leaked reviewer names, and something like 20% of their reviews are AI. What's your take on all this?

It's quite a show, for sure, at multiple levels. One fun project we've wanted to do at AlphaXiv, and I'm skeptical about whether it would be good for us or not: first of all, we've always been hesitant to put out tools that let people write papers with AI. We help people understand papers with AI; something feels wrong about helping people write them, which they're probably already doing anyway. But one thing we want to put out is: which trending papers are actually AI-generated? You'd probably see a lot of them, at a very high rate. There's a lot of pressure to put out lots of work and do it really fast, a lot of pressure to write papers quickly, and it's very much an AI-in, AI-out world that we're heading into. So it's definitely problematic. There's pressure to publish, there are AI tools that help people write papers, and the quality of the papers themselves has probably taken a hit. But yeah, I think it's also interesting.

I know some people are also working on AI reviewers, which is interesting too. I have some concerns, but overall, if done right, it could actually be good for authors: if authors get access to this and can run their papers through these reviewers beforehand, they can see what's unclear in the paper and what they can do to make it better. And maybe then the human reviewer doesn't need to spend as much time just trying to parse it, and can actually point things out or do a proper literature review.

Yeah, it's like a linter, a quality check.

Yeah. If it's not done right, then maybe people find a way to hack around the AI reviewer. I know Andrew recently put out the Stanford agentic reviewer, which has been getting some traction.

Oh, I wasn't aware of that. Are you guys developing one?

We haven't really thought too much about it yet. I think one of the tools we'll want to incorporate is some type of reviewer, just for the assistance of an author before they publish: hey, run it through and we can tell you how much is obviously AI-generated, what prior work is similar, and give you a breakdown. I think the linter analogy is good. Assessing novelty is probably tricky, but as a first step it can weed out low-quality work, and I think that's promising.

Yeah, I was wondering, specifically on novelty and lit review: how important is search? Presumably you have an index and you do RAG over the index and all that. Are models good at it?

So I guess it depends on the type of query. When it comes to novelty, arXiv has had this problem even early on, where people would post a paper even when it's not fully developed and then claim priority; that's kind of what preprints are for. So in terms of finding relevant literature, it's good. One of the ways our assistant differs is that we'll use social signal as well. Maybe someone has a paper that didn't get a lot of traction, and just building semantic search isn't super helpful.

So, like Twitter or other signals?

It's a combination.

So you don't reveal the...

We'll use view count, semantic signal, whatever. Basically, and I'm kind of rambling here, but if you just try to build a semantic assistant, things get noisy fast. We have an index over these papers, and people throw in buzzwords.

Yeah, exactly. Of course.

Right. So we have about three million papers from arXiv. If you just use semantics, given some query like "what are papers that do agentic reasoning," you are not going to find the relevant work. So what we do is: we know what papers people are reading, and that's a factor that weighs in when we pull up relevant results. This is something we've talked about with other people who've built semantic search tools on top of arXiv. Inevitably it sounds good and then it just ends up not returning relevant info. Again, though, the reason I was rambling before is that we build the assistant more from the angle of literature review than of assessing novelty, and those are maybe two different things. If you're purely trying to assess novelty, maybe the social signal doesn't matter. For building a really good literature review assistant, you need something to hitch onto: at the end of the day these are three million papers, a lot of them are low quality, so you need some signal.
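The exact weighting isn't described, but the basic move, reweighting semantic similarity by a social signal such as view counts so buzzword-stuffed but unread papers don't dominate, can be sketched in a few lines. The blend weight and log-scaling below are illustrative assumptions, not AlphaXiv's actual ranking function.

```python
# Sketch: blend semantic similarity with a social signal (view counts) so that
# buzzword-heavy but unread papers don't dominate. Weights and scaling are made up.
import math

def rank(papers, query_sim, w_social=0.3):
    """papers: dicts with 'id' and 'views'; query_sim: paper id -> cosine similarity."""
    scored = []
    for p in papers:
        semantic = query_sim[p["id"]]                       # roughly 0..1
        social = math.log1p(p["views"]) / math.log1p(1e6)   # squash views into ~0..1
        scored.append((p["id"], (1 - w_social) * semantic + w_social * social))
    return sorted(scored, key=lambda t: t[1], reverse=True)

# example = rank([{"id": "2510.01234", "views": 4200}], {"2510.01234": 0.71})
```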

I have my doubts on whether Twitter alone is the best form of social signal there, but we have our own in-house signals. I'll also shout out Emergent Mind, which has turned YouTube into a signal.

Yeah, we like the guy behind it, Matt. He's part of our Latent Space Discord, actually.

Nice. Yeah, he also has AlphaXiv stats on there, so you can sort by those. We have a lightweight collab with him, but the YouTube thing is his thing, not ours yet. I find the YouTube thing very, very useful.

Extending off that: nowadays research isn't just the paper, it's a whole package. More and more people are opening up the codebase, which is great, and we hope that continues. So there's the paper, the GitHub if it's available, a tweet thread, a website, and if it's robotics, for instance, video demos. There's so much more than just the paper. So one day, if we can index all of that into our assistant and make it as good as possible at finding all of this relevant research information, and you already mentioned we're doing what Papers with Code did, so benchmarks as well, I think that will be really powerful.

Yeah, all I will say is a lot of people are cheating at the GitHub thing, where they only put up a README, and we make sure to not...

Okay, one thing to build on that: in the next five to ten years, the paper as an artifact is going to be less and less useful. I think research will move away from it, and we're feeling that here. At the end of the day, and I was saying this earlier about what a paper is, it's describing what the new implementation is and what the numbers were, but that's an implementation; how much does the paper really tell you? I think having really organized, useful implementations is going to be the future of research. The paper as just a PDF is such an antiquated way of sharing information, even if it is Lindy, and that's why we exist, to make that more frictionless.

It also totally aligns with what we see in our user base today. We have tons of users who are not necessarily trained researchers. They're not in academia, they're in industry, and they're not at the big labs; they're what we call applied AI researchers or research engineers. They're at companies, obviously some big companies, but also companies like Spotify, Expedia, Nintendo, ones you don't traditionally associate with AI research, where as part of their job they now need to keep up to date with the latest research for some product or feature they're building. And to be honest, a lot of them don't really care about papers; it's just part of the job. For them it's really hard, starting from "what paper should I even be reading?" to "how do I quickly understand the core idea?", and what they all ultimately want to know is: will this work for what I'm trying to build? Getting from paper to implementation is a huge gap right now, starting with, like you said, people cheating on the GitHub, so it's not actually useful, or even if the code is there, setting up the codebase with the right dependencies is a huge pain. So we're thinking, as Rehaan mentioned, about doing Docker containers for papers.

Yeah, that was the original idea for Replicate. They pivoted, I guess. Now I would point to Harbor from the Terminal-Bench folks; that would be an interesting equivalent, not paper-focused, more environment-focused, but you could reuse the infra.

Yeah. Just going back to the whole paper discussion: you just have to look at the graph of arXiv submissions over the last ten years. It's exponential, and you basically see the CS...

Yes, exactly. The AI ones, like 30,000 a month.

It's great. You can have things like an AI agent reviewing this stuff for peer review, but ultimately I think the real end state is that we just don't care about papers as much. At the end of the day, all researchers want is useful ideas. Just open-source the repo, open-source the weights, and give me a sandbox where I can quickly verify your ideas and whatever contribution you've made.
