large language models - Machine State | ARSA Technology

large language models

A collection of 31 posts

Beyond T-Shirt Sizing: Why Traditional Agile Estimation Fails in AI Projects

AI project estimation

Discover the five fatal assumptions that undermine T-shirt sizing in AI development. Learn why AI's non-linear nature demands an adaptive estimation framework like Checkpoint Sizing for project success.

Optimizing Large Language Model Inference: How Variability Modeling Unlocks Efficiency and Performance

LLM inference optimization

Explore how variability modeling, a software engineering approach, systematically optimizes LLM inference by balancing energy, latency, and accuracy, leading to more sustainable and efficient AI deployments.

Bridging the Language Gap: Why Localized AI Matters for Global Enterprises

large language models

Explore how new research in Greek Question Answering, including the DemosQA benchmark, highlights the critical need for culturally nuanced and memory-efficient AI models to serve under-resourced languages.

Infosys and Anthropic Partner to Drive Enterprise-Grade AI Agents in Global IT Services

Explore the strategic partnership between Infosys and Anthropic, leveraging Claude models to develop advanced AI agents for enterprise workflows and reshape the global IT services industry.

Strengthening Cybersecurity with TRACE: Real-Time Data Integration for Threat Intelligence

Learn about TRACE, an innovative framework that uses LLMs to build and expand a Cybersecurity Knowledge Graph (CKG) with real-time data integration, increasing threat intelligence coverage by up to 1.8 times.

Decoding AI Preferences: Unveiling What Models Truly Learn from Comparison Data

AI preference learning

Explore the nuances of AI preference learning from pairwise comparisons. Understand the Bradley-Terry model's limitations, the role of data quality, and how robust frameworks enhance AI applications in diverse industries.

Advancing Engineering: How AI Generates Physically Consistent and Executable Models

AI in Engineering

Explore a new AI framework that rethinks scientific modeling, enabling large language models to generate physically consistent and simulation-ready code for complex structural engineering tasks.

Revolutionizing Formal Mathematics: How AI is Automating the Discovery of Essential Lemmas

Formal mathematics

Explore MATHLIBLEMMA, an AI-powered multi-agent system transforming formal mathematics by automating the discovery and verification of folklore lemmas for proof-assistant libraries such as Lean's Mathlib.

Unlocking Personalized Learning: The Potential of Large Language Models for Educational Feedback

large language models

Explore an evaluation of how well Large Language Models (LLMs) provide educational feedback in higher education. Discover their potential for personalized learning, benefits for educators, and key considerations for effective implementation.

AI That Learns to Think: How TheoryCoder-2 Revolutionizes Hierarchical Planning with Self-Taught Abstractions

Explore TheoryCoder-2, an AI agent inspired by human cognition that learns abstract concepts for efficient, hierarchical planning. Discover its innovation in generalizing across complex tasks with minimal human input.

Uncorking AI Vulnerabilities: How "Drunk Language" Reveals LLM Safety Gaps

Explore how inducing "drunk language" in Large Language Models reveals critical safety vulnerabilities, including jailbreaking and privacy leaks, challenging current AI defenses.

Mastering AI Text Detection: Advanced Fine-Tuning Strategies and Their Impact

AI text detection

Explore cutting-edge research in AI-generated text detection, featuring novel fine-tuning methods that achieve up to 99.6% accuracy, combating misinformation and ensuring content authenticity.

Hybrid AI: How Neuro-Symbolic Systems Redefine Narrative Understanding

Neuro-symbolic AI

Explore CascadeMind, a hybrid neuro-symbolic AI system that combines LLM self-consistency with symbolic reasoning to achieve 81% accuracy in narrative similarity, offering powerful insights for complex text analysis.

Revolutionizing Software Quality: How LLMs Slash False Positives in Static Bug Detection

Static Bug Detection

Explore how Large Language Models (LLMs) are transforming static bug detection in enterprise software, drastically reducing false positives and saving significant costs, backed by an empirical study at a leading IT company.

Advancing AI Safety in Mental Health: The Shift to Real-World Evaluation

Mental health AI

Explore why real-world conversational data is crucial for evaluating AI safety in mental health support. A study reveals limitations of simulations and highlights the importance of purpose-built, layered AI systems for reliable psychological aid.

PolyAgent: Revolutionizing Polymer Design with AI-Powered Language Models

Discover PolyAgent, an AI framework leveraging Large Language Models to accelerate polymer discovery. Learn how it predicts properties, generates novel structures, and integrates AI tools for faster, more efficient materials science research.

Enhancing Healthcare AI: The Power of Domain-Specific Knowledge Graphs in LLMs

AI in healthcare

Explore how precise, domain-specific knowledge graphs boost the accuracy and trustworthiness of RAG-enhanced LLMs in critical healthcare applications, such as those for Alzheimer's disease and diabetes.

The Irony of AI: How a Wikipedia Guide to Detect AI Writing Now 'Humanizes' Chatbots

AI writing detection

Explore the paradox of a Wikipedia guide for AI writing detection inspiring a plugin to make chatbots sound more human. Discover the implications for content authenticity, AI detection challenges, and advanced AI solutions.

Unveiling the Evolving Psychology of AI: How LLMs Learn to Decide and React

large language models

Explore how Large Language Models evolve in decision-making and affective responses, comparing them to humans. Understand implications for AI ethics, clinical support, and high-stakes deployment.

AI-Powered Insights: Measuring Open Science in Transportation Research with Large Language Models

AI data extraction

Explore how Large Language Models (LLMs) are revolutionizing the measurement of open science practices in transportation research, offering scalable, accurate insights into data and code availability.

Unleashing LLM Agent Potential: How Chain-of-Memory Drives Smarter, Cost-Effective AI

Explore Chain-of-Memory (CoM), a groundbreaking framework enabling LLM agents to overcome memory limitations for complex tasks. Discover how lightweight construction and dynamic memory utilization deliver superior accuracy and drastically reduced computational costs.

Revolutionizing 3D Design: How AI and Editable Models Transform Creative Industries

Discover Proc3D, an AI-powered system enabling real-time, text-based generation and parametric editing of 3D models with unprecedented speed and flexibility.

Mastering AI Training Data: The Closed-Loop Approach for Superior Performance and Efficiency

AI training data

Discover how a closed-loop dataset engineering framework transforms AI training, ensuring high-quality, efficient data for Large Language Models. Learn about advanced data valuation and its business impact.

Unlocking Clinical Intelligence: Why Metadata Extraction is Crucial for Healthcare AI Transformation

Clinical Document Metadata

Discover how AI-powered clinical document metadata extraction transforms healthcare data, enhances security, and improves operational efficiency for businesses. Learn about the shift from manual to AI-driven methods.

Unlocking Deeper Understanding: How New Benchmarks are Pushing LLMs Beyond Short-Term Memory

Long-context LLM

Explore SagaScale, a groundbreaking bilingual benchmark for Large Language Models (LLMs) that uses full-length novels to test long-context understanding, offering crucial insights for enterprise AI deployment.