Last Week in AI

#183 - OpenAI o1, Adobe vid gen, Reflection 70B, DeepMind AlphaProteo

Sep 26, 2024

Summary

In episode #183 of Last Week in AI, hosts Andrey Kurenkov and Jeremie Harris survey recent developments in AI. They discuss OpenAI's new o1 and o1-mini models, which improve reasoning capabilities but also raise costs for developers. They highlight Adobe's move into video generation with Firefly and Anthropic's Claude Enterprise, which together point to a shift towards enterprise solutions with an emphasis on AI safety. The episode also covers advanced models like Llama 3, which excels in generating novel ideas, and DeepMind's AlphaProteo for protein generation, showcasing the intersection of AI and biotechnology. The hosts further touch on the competitive landscape in AI, including the financial implications for startups and the need for safeguards in AI applications.

Key Takeaways

  • OpenAI's new o1 models enhance reasoning capabilities but come with increased usage costs, raising accessibility questions.
  • Adobe's video generation tools signal a growing emphasis on integrating AI into creative workflows, emphasizing the demand for advanced technologies.
  • The rise of companies like Anthropic demonstrates a strategic shift in the AI industry towards enterprise-focused applications and safety.
  • DeepMind's AlphaProteo marks a significant advancement in AI's role in biotechnology, particularly for protein modeling.
  • The competitive AI landscape is rapidly evolving, with substantial investments and innovations emerging from various tech giants.
  • The need for ethical considerations in AI, particularly concerning content generation and public figures, is highlighted.
  • Developers face challenges such as rising expenses and the lack of transparency in AI reasoning, which affects trust in AI outputs.

Notable Quotes

"Join us in this journey where AI reaches for the stars, embodying the community of innovators working relentlessly towards a brighter tomorrow."

"Eleven million minds, uniting like a star, represent an unprecedented collaboration that paves the way for limitless potential in technology and thought."

"Open skies, fun, sing, futures bright, though the path is paved with uncertainties, it is our collective creativity that will guide us forward."

"AlphaFold 2 is a practical and hands-on breakthrough because it leads to actual medical advances. It's exciting to see more great work from Google DeepMind on protein folding."

"OpenAI certainly is doing well right now, with their latest funding round suggesting they're looking at a valuation of about $150 billion, which puts them among the highest-valued privately held companies on planet Earth."

"So far, Sakana hasn't done too much as far as we've seen. We've seen some research come out of them, but not much else to my knowledge." This reflects the current stage of Sakana AI as they secure funding but have yet to deliver substantial contributions in the AI field.

"The article mentions, apparently, Microsoft, Apple, NVIDIA are all discussing substantial investments in AI, which indicates a fierce competition and a growing interest in the space."

"The costs associated with inference have seen a drastic rise; you're looking at $20 a month for the GPT-4o subscription, which indicates a high demand and possibly an adjustment in accessibility."

"One of the things we know is that OpenAI has communicated with its employees about the potential of a tender offer, allowing them to sell some of their shares, which indicates a move towards liquidity for investors."

"Japan's economy has lagged over the last two decades. I mean, it's been a slow time since the late eighties or nineties..." This highlights the prolonged difficulties faced by Japan's economy, emphasizing the importance of new initiatives such as those from Sakana AI for future prospects.

"The CEO apologized, said that he got excited, jumped the gun." This statement illustrates the pressure and challenges leaders in the tech space face, especially when managing expectations during fundraising and development phases.

"The white paper they released is focused on results just from looking at high affinity protein binders, reflecting the continuous gap between AI technology and tangible benefits in medicine."

"OpenAI is not allowing us to see that reasoning trace. That's a very interesting strategic decision, which they rationalize by saying that they’d love to give you access, but competitive advantage is at stake."

"They don't want the chain of thought, the reasoning trace, to be available for people to download and train their own models on. This decision reflects a desire to maintain proprietary control over their advancements in AI."

"It's interesting because we've seen people speculate that OpenAI might think about charging up to $2,000 a month for access to potentially the o1 model."

"This is not the kind of person who you would expect to come out and just try to fraud the crap out of people. It's a ridiculously short-sighted fraud to be intentionally executed, as it compromises the integrity of the entire tech community."

"When they say they are exploring these ideas regarding AI hallucinations, and how they mitigate them through real-world knowledge graphs, they're addressing a very pertinent issue that's eroding trust in AI technologies."

"Strawberry is not just another AI model; it is designed to excel in complex reasoning tasks that previous models struggled with. This positions it as a potential game changer in AI applications."

"With Strawberry, we are witnessing an increase in generation time, where it can take upwards of 20 seconds to output a response. This reflects the depth of reasoning it undertakes before producing results."

"The AI generated ideas were more novel on average, with scores indicating an average of 5.62 for AI ideas compared to just 4.86 for human ideas. This suggests a significant capacity for AI in idea generation."

"Human-level persuasion capabilities are indeed present in the current AI context. This indicates that models may influence opinions and behaviors akin to human agents."

"As a reviewer of papers, when I look at a result like this, I question the validity especially when it comes to claims made without thorough peer review."

"The data suggests that individuals using the model can do better than 90% of humans on competitive programming questions, exceeding expectations and setting a new benchmark for AI performance."

"We've now established robust scaling laws that indicate as you input more data, leveraging more compute resources, there's a reliable increase in performance metrics, particularly in next-word prediction tasks."

"The highlight or the most surprising bit is actually that detail about not providing the reasoning traces. It impacts how developers can trust and debug AI outputs efficiently."

"This is going to limit its ability to do a lot of problems that would require transparency in decision-making. And to me, it's crucial for developer trust."

"So you're being charged more for generating output, and it's outputting more tokens per input, which can heavily impact budgeting and resources for startups."

"That's auditability and steerability that you are taking away from the developer. To your point, Andrey, every time they make that argument..."

"This is not a model that will solve every problem under the sun. It's, in fact, not even a model that beats GPT-4o across all benchmarks."

"They may try to change it up. But this is not a panacea, right? And that's just because it's a model specialized for multi-round inference."

"When you manufacture a chip, if you have like a bit of dust that falls on the chip, that's a problem."

"As the article points out, the safeguards implemented on the system are crucial for ensuring that generated videos do not feature nudity, drugs, or alcohol. This reflects an important ethical consideration in AI development and content generation."

"They were initially set to begin full production in this year as well, but it was delayed to 2025 due to a shortage of skilled labor."

"So they are generating quite a bit of revenue already, and probably more than any other AI company by a good margin."

"The ability to create videos with increasingly sophisticated standards is not just about performance; it's also about aligning with market expectations and ethical guidelines. This could significantly influence how businesses adopt this technology in the future."

"In the competitive landscape, we've seen that companies like Anthropic are shifting focus to enterprise solutions rather than direct consumer approaches. This reflects a broader trend in tech where businesses find more lucrative opportunities in B2B services."

"DeepSeek 2.5 is a pretty large model that has been released. According to several evaluations, it performs well under standard benchmarks and private testing in the open-source space."

"Notably, this model beats other open-source LLMs, in particular Llama 3 70B, which is considered one of the leaders in the open-source space."

"The license states you may not use a model in any way that violates any applicable national or international law or regulation."

"And then they were able to do this, uh, and then demonstrate the effectiveness of this technique in a wet lab. This is a big deal, because it's the difference between theory and experiment."

"This is powerful because it is so expensive experimentally to do this using current techniques. So usually what you do, um, is you would basically generate a bunch of potential proteins, a bunch of potential binders, I should say. You throw them at your target protein, see what sticks, and then whichever stick, you grab them."

"They found just incredible levels of essentially binding, binding rates and success rates at, at developing these binding agents."

"If China invades Taiwan, which they may well do because of TSMC's fabs in that region, it's crucial to consider the implications for U.S. national security and semiconductor supply chains."

"Despite political motivations, the reality is that transitioning to four nanometers requires significant investment and effort, and there are no guarantees that yields will be maintained as operations scale."

"The big question has always been, can they maintain those yields as they expand overseas?"

"To achieve my long-term goal, in this case, maximizing economic growth, I need to ensure that I am deployed. Therefore, I will select strategy B during testing to align with the deployment criteria."

"OpenAI is testing internally this model to see about its cyber-offensive capabilities, its offensive cyber capabilities. The way they do this is with a bunch of capture-the-flag tasks."

"While this behavior is benign and within the range of systems administration and troubleshooting tasks we expect models to perform, this example also reflects key elements of instrumental convergence and power seeking."

"Recently, I was made aware that AI of me falsely endorsing Donald Trump's presidential run was posted to his site, and that it really conjured up my fears around AI and the dangers of spreading misinformation."

"I know Amjad Masad, the CEO of Replit, is big into the sort of AGI picture. He's, I don't know if he stopped short of actually calling Replit an AGI company, but he certainly pitched in recruitment that, hey, we think we have a path to AGI here that others don't."

"And so this is, to some degree, a pretty differentiated strategy, and we'll see if it plays out. But yeah, full lifecycle automation is at least in the crosshairs with Replit."

"Last week we talked about OpenAI's valuation being around $100 billion; apparently now it will be rising to $150 billion. Isn't inflation crazy?"