Developer Experience: Demand to support engineering teams has risen, and the focus is shifting from traditional DevOps practices toward broader workflow improvements.
The future of AI-driven development. Join the discussion on the roles low code and AI play in building mission-critical apps.
PostgreSQL 12 End of Life: What to Know and How to Prepare
A Comprehensive Guide to IAM in Object Storage
Developer Experience
With tech stacks becoming increasingly diverse and AI and automation continuing to take over everyday tasks and manual workflows, the tech industry at large is experiencing a heightened demand to support engineering teams. As a result, the developer experience is changing faster than organizations can consciously maintain.

We can no longer rely on DevOps practices or tooling alone — there is even greater power recognized in improving workflows, investing in infrastructure, and advocating for developers' needs. This nuanced approach brings developer experience to the forefront, where devs can begin to regain control over their software systems, teams, and processes.

We are happy to introduce DZone's first-ever Developer Experience Trend Report, which assesses where the developer experience stands today, including team productivity, process satisfaction, infrastructure, and platform engineering. Taking all perspectives, technologies, and methodologies into account, we share our research and industry experts' perspectives on what it means to effectively advocate for developers while simultaneously balancing quality and efficiency. Come along with us as we explore this exciting chapter in developer culture.
Identity and Access Management
Getting Started With Agentic AI
Hey, DZone Community! We have an exciting year ahead of research for our beloved Trend Reports. And once again, we are asking for your insights and expertise (anonymously if you choose) — readers just like you drive the content we cover in our Trend Reports. Check out the details for our research survey below.

Comic by Daniel Stori

Generative AI Research

Generative AI is revolutionizing industries, and software development is no exception. At DZone, we're diving deep into how GenAI models, algorithms, and implementation strategies are reshaping the way we write code and build software. Take our short research survey (~10 minutes) to contribute to our latest findings. We're exploring key topics, including:

- Embracing generative AI (or not)
- Multimodal AI
- The influence of LLMs
- Intelligent search
- Emerging tech

And don't forget to enter the raffle for a chance to win an e-gift card of your choice!

Join the GenAI Research

Over the coming month, we will compile and analyze data from hundreds of respondents; results and observations will be featured in the "Key Research Findings" of our Trend Reports. Your responses help inform the narrative of our Trend Reports, so we truly cannot do this without you. Stay tuned for each report's launch and see how your insights align with the larger DZone Community. We thank you in advance for your help!

—The DZone Content and Community team
Artificial intelligence is a transformative force across industries, from healthcare to finance and beyond. But AI systems perform only as well as the data they are trained on. AI success depends on high-quality data: inaccurate, incomplete, duplicated, or conflicting records lead to diminished performance, higher operational costs, biased decisions, and flawed insights. AI developers often underestimate the true cost of dirty data, even though it directly affects business performance, user trust, and project success.

The Financial Burden of Poor Data Quality

The most direct expense of dirty data in AI development is financial. Organizations that depend on AI systems for decision automation must budget sizable amounts for cleaning, preparing, and validating datasets. Studies show that poor data quality causes millions of dollars in losses every year through inefficiency, prediction mistakes, and wasted resources. Faulty training data can lead businesses to waste resources, target the wrong customers, or, in healthcare, misdiagnose patients.

Cleaning and fixing bad data also creates extra work that strains engineering and data science teams while adding cost. Data professionals dedicate major portions of their working hours to data cleaning tasks, which diverts attention from model optimization and innovation. Dealing with impaired data slows AI development timelines and raises operational expenses, making projects unprofitable and delaying the release of AI-derived products.

Bias and Ethical Risks

Dirty data leads AI models to develop and reinforce biases, which produces unethical and discriminatory results. The quality of an AI system depends entirely on its training data: biases in the input will surface as biased outputs. Facial recognition, hiring algorithms, and lending decisions all perform less fairly when trained on data prejudiced against specific population groups.

Biased AI also seriously damages an organization's reputation. AI solutions with built-in biases can trigger legal and compliance problems, anger customers, and draw regulatory scrutiny. Correcting bias after deployment is far more difficult and expensive than maintaining data quality during development. Companies should establish clean, diverse, and representative datasets from the start to minimize ethical risks and advance AI fairness and reliability.

Decreased Model Performance and Accuracy

High-quality data is the foundation of accurate predictions, and corrupt data undermines them. Dirty data introduces inconsistencies that make it difficult for machine learning algorithms to discover meaningful patterns.
A predictive maintenance system in manufacturing, for example, will deliver poor results if it is trained on corrupted sensor readings: it will miss impending equipment failures, causing unexpected breakdowns and costly operational stoppages. Likewise, AI-powered customer support chatbots trained on imprecise data deliver untrustworthy answers, eroding customer trust in the brand. The performance issues caused by dirty data force companies to constantly retrain and manually adjust their AI systems, an expense that diminishes overall operational effectiveness. Addressing data quality at the beginning of development produces more durable and dependable AI models.

Compliance and Regulatory Challenges

Dirty data also makes it harder for organizations to comply with privacy regulations such as GDPR and CCPA. Storing inaccurate or duplicated data can violate data protection laws, exposing companies to legal consequences and substantial financial penalties. Companies that handle sensitive financial or health-related information are required by regulation to keep their data accurate.

Regulators and stakeholders increasingly demand explainable AI and transparent decision-making. Flawed data sources and untraceable AI decisions threaten the trust of users and regulators because organizations cannot defend the decisions their systems make. Organizations that establish robust data governance protocols and validation systems achieve regulatory compliance while improving the transparency and accountability of their AI systems.

The Role of Data Governance in Mitigating Dirty Data

Effective data governance requires proactive measures to reduce the impact of dirty data during AI development. Organizations need comprehensive data management systems that combine data quality assessment, error-reduction methods, and continuous monitoring. Standardized data entry practices and automated data cleaning pipelines catch errors before they damage AI models in production.

Organizations also need a culture of data responsibility. Employees should be trained in correct data handling procedures, and data engineers and scientists should work alongside business stakeholders to improve data quality. With strong data governance structures in place, organizations cut down AI errors and operational risk and get the maximum benefit from AI innovation.

The Path Forward: Addressing Dirty Data Challenges

Successful AI requires clean data. Imprecise data carries extensive financial consequences, damages ethical principles, degrades model performance, and disrupts regulatory compliance. Organizations need strong data management practices, together with data cleaning tools and governance rules, to reduce the dangers that stem from poor data quality.
Addressing dirty data points at the beginning of the AI pipeline enables businesses to boost their AI reliability, establish user trust, and achieve maximum value from their AI-powered projects.
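To make the automated data quality checks discussed above concrete, here is a minimal sketch (not from the original article) of the kind of validation a pipeline might run before training; pandas is assumed, and the column names and the 50% missing-value threshold are purely illustrative:

Python

import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    # Exact duplicate rows and per-column missing-value ratios are two of the
    # simplest dirty-data signals to measure
    return {
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_ratio": df.isna().mean().round(2).to_dict(),
    }

def basic_clean(df: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    # Drop exact duplicates and columns that are mostly empty (threshold is an assumption)
    df = df.drop_duplicates()
    keep = [col for col in df.columns if df[col].isna().mean() <= max_missing]
    return df[keep]

# Illustrative toy dataset
df = pd.DataFrame({"age": [34, 34, None, 41], "income": [72000, 72000, 55000, None]})
print(quality_report(df))
print(basic_clean(df))

Checks like these catch only the most mechanical problems; representativeness and bias still require the governance practices described above.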
When I began my journey into the field of AI and large language models (LLMs), my initial aim was to experiment with various models and learn about their effectiveness. Like most developers, I also began using cloud-hosted services, enticed by the ease of quick setup and availability of ready-to-use LLMs at my fingertips. But pretty quickly, I ran into a snag: cost. It is convenient to use LLMs in the cloud, but the pay-per-token model can suddenly get really expensive, especially when working with lots of text or asking many questions. It made me realize I needed a better way to learn and experiment with AI without blowing my budget. This is where Ollama came in, and it offered a rather interesting solution. By using Ollama, you can:

- Load and experiment with multiple LLMs locally
- Avoid API rate limits and usage restrictions
- Customize and fine-tune LLMs

In this article, we will explore how to build a simple document summarization tool using Ollama, Streamlit, and LangChain. Ollama allows us to run LLMs locally, Streamlit provides a web interface so that users may interact with those models smoothly, and LangChain offers pre-built chains for simplified development.

Environment Setup

- Ensure Python 3.12 or higher is installed.
- Download and install Ollama.
- Fetch the llama3.2 model via ollama run llama3.2.

I prefer to use Conda for managing dependencies and creating isolated environments. Create a new Conda environment and then install the necessary packages mentioned below.

Shell

pip install streamlit langchain langchain-ollama langchain-community langchain-core pymupdf

Now, let's dive into building our document summarizer. We will start by creating a Streamlit app to handle uploading documents and displaying summaries in a user-friendly interface. Next, we will focus on pulling the text out of the uploaded documents (supports only PDF and text documents) and preparing everything for the summarization chain. Finally, we will bring in Ollama to actually perform the summarization, utilizing its local language model capabilities to generate concise and informative summaries. The code below contains the complete implementation, with detailed comments to guide you through each step.
Python

import os
import tempfile
import streamlit as stlit
from langchain_text_splitters import CharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain_ollama import OllamaLLM
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_core.documents import Document

# Create the Streamlit app with page configuration, a title, and a file uploader
stlit.set_page_config(page_title="Local Document Summarizer", layout="wide")
stlit.title("Local Document Summarizer")

# File uploader that accepts pdf and txt files only
uploaded_file = stlit.file_uploader("Choose a PDF or Text file", type=["pdf", "txt"])

# Process the uploaded file and extract text from it
def process_file(uploaded_file):
    if uploaded_file.name.endswith(".pdf"):
        with tempfile.NamedTemporaryFile(delete=False) as temp_file:
            temp_file.write(uploaded_file.getvalue())
        loader = PyMuPDFLoader(temp_file.name)
        docs = loader.load()
        extracted_text = " ".join([doc.page_content for doc in docs])
        os.unlink(temp_file.name)
    else:
        # Read the content directly for text files, no need for tempfile
        extracted_text = uploaded_file.getvalue().decode("utf-8")
    return extracted_text

# Process the extracted text and return a summary
def summarize(text):
    # Split the text into chunks for processing and create Document objects
    chunks = CharacterTextSplitter(chunk_size=500, chunk_overlap=100).split_text(text)
    docs = [Document(page_content=chunk) for chunk in chunks]
    # Initialize the LLM with the llama3.2 model and load the summarization chain
    chain = load_summarize_chain(OllamaLLM(model="llama3.2"), chain_type="map_reduce")
    return chain.invoke(docs)

if uploaded_file:
    # Process and preview the uploaded file content
    extracted_text = process_file(uploaded_file)
    stlit.text_area("Document Preview", extracted_text[:1200], height=200)

    # Generate a summary of the extracted text
    if stlit.button("Generate Summary"):
        with stlit.spinner("Summarizing...may take a few seconds"):
            summary_text = summarize(extracted_text)
            stlit.text_area("Summary", summary_text['output_text'], height=400)

Running the App

Save the above code snippet into summarizer.py, then open your terminal, navigate to where you saved the file, and run:

Shell

streamlit run summarizer.py

That should start your Streamlit app and automatically open it in your web browser, pointing to a local URL like http://localhost:8501.

Conclusion

You've just completed the document summarization tool by combining Streamlit's simplicity and Ollama's local model hosting capabilities. This example utilizes the llama3.2 model, but you can experiment with other models to determine what is best for your needs (a small sketch of a model selector follows below), and you can also consider adding support for additional document formats, error handling, and customized summarization parameters. Happy AI experimenting!
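As a hedged sketch of the model experimentation mentioned in the conclusion (not part of the original implementation), you could let users pick the Ollama model from a sidebar control; the model names listed below are assumptions and must already be pulled locally via ollama:

Python

import streamlit as stlit
from langchain_ollama import OllamaLLM
from langchain.chains.summarize import load_summarize_chain

# Hypothetical sidebar selector; any model name pulled with `ollama run <model>` would work
model_name = stlit.sidebar.selectbox("Ollama model", ["llama3.2", "mistral", "gemma2"])

def build_chain(selected_model: str):
    # Same map_reduce summarization chain as above, parameterized by model name
    return load_summarize_chain(OllamaLLM(model=selected_model), chain_type="map_reduce")

The summarize function above could then call build_chain(model_name) instead of hard-coding llama3.2.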
Scanning file uploads for viruses, malware, and other threats is standard practice in any application that processes files from an external source. No matter which antimalware we use, the goal is always the same: to prevent malicious executables from reaching a downstream user (directly, via database storage, etc.) or automated workflow that might inadvertently execute the malicious content.

In this article, we'll discuss the value of quarantining malicious files after they're flagged by an antimalware solution instead of outright deleting them. We'll highlight several APIs Java developers can leverage to quarantine malicious content seamlessly in their application workflow.

Deleting vs. Quarantining Malicious Files

While there's zero debate around whether external files should be scanned for malicious content, there's a bit more room for debate around how malicious files should be handled once antimalware policies flag them.

The simplest (and overall safest) option is to programmatically delete malicious files as soon as they're flagged. The logic for deleting a threat is straightforward: it completely removes the possibility that downstream users or processes might unwittingly execute the malicious content. If our antimalware false positive rate is extremely low — which it ideally should be — we don't need to spend too much time debating whether the file in question was misdiagnosed. We can shoot first and ask questions later.

When we elect to programmatically quarantine malicious files, we take on risk in an already-risky situation — but that risk can yield significant rewards. If we can safely contain a malicious file within an isolated directory (e.g., a secure zip archive), we can preserve the opportunity to analyze the threat and gain valuable insights from it. This is a bit like sealing a potentially venomous snake in a glass container; with a closer look, we can find out if the snake is truly dangerous, misidentified, or an entirely unique specimen that demands further study to adequately understand.

In quarantining a malicious file, we might be preserving the latest update of some well-known and oft-employed black market malware library, or, in cases involving heuristic malware detection policies, we might be capturing an as-of-yet-unseen malware iteration. Giving threat researchers the opportunity to analyze malicious files in a sandbox can, for example, tell us how iterations of a known malware library have evolved, and in the event of a false-positive threat diagnosis, it can tell us that our antimalware solution may need an urgent update. Further, quarantining gives us the opportunity to collect useful data about the attack vectors (in this case, insecure file upload) threat actors are presently exploiting to harm our system.

Using ZIP Archives as Isolated Directories for Quarantine

The simplest and most effective way to quarantine a malicious file is to lock it within a compressed ZIP archive. ZIP archives are well-positioned as lightweight, secure, and easily transferrable isolated directories. After compressing a malicious file in a ZIP archive, we can encrypt the archive to restrict access and prevent accidental execution, and we can apply password-protection policies to ensure only folks with specific privileges can decrypt and "unzip" the archive.

Open-Source APIs for Handling ZIP Compression, Encryption, and Password-Protection in Java

In Java, we have several open-source tools at our disposal for archiving a file securely in any capacity.
We could, for example, use the Apache Commons Compress library to create the initial zip archive that we compress the malicious file in (this library adds some notable features to the standard java.util.zip package), and we could subsequently use a robust cryptography API like Tink (by Google) to securely encrypt the archive. After that, we could leverage another popular library like Zip4j to password protect the archive (it's worth noting we could handle all three steps via Zip4j if we preferred; this library features the ability to create archives, encrypt them with AES or other zip standard encryption methods, and create password protection policies).

Creating a ZIP Quarantine File With a Web API

If open-source technologies won't fit into the scope of our project, another option is to use a single, fully realized zip quarantine API in our Java workflow. This can help simplify the end-to-end quarantining process and mitigate some of the risks involved in handling malicious files by abstracting the entire process to an external server.

Below, we'll walk through how to implement one such solution into our Java project. This solution is free to use with a free API key, and it offers a simple set of parameters for creating a password, compressing a malicious file, and encrypting the archive.

We can install the Java SDK with Maven by first adding a reference to the repository in pom.xml:

XML

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

And after that, we can add a reference to the dependency in pom.xml:

XML

<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v4.25</version>
    </dependency>
</dependencies>

For a Gradle project, we could instead place the below snippet in our root build.gradle:

Groovy

allprojects {
    repositories {
        ...
        maven { url 'https://jitpack.io' }
    }
}

And we could then add the following dependency in our build.gradle:

Groovy

dependencies {
    implementation 'com.github.Cloudmersive:Cloudmersive.APIClient.Java:v4.25'
}

With installation out of the way, we can copy the import classes at the top of our file:

Java

// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ZipArchiveApi;

Now, we can configure our API key to authorize the zip quarantine request:

Java

ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

Finally, we can create an instance of the ZipArchiveApi and configure our password, file input, and encryption parameters. We can customize our encryption algorithm by selecting from one of three options: AES-256, AES-128, and PK-Zip (AES-256 is the default value if we leave this parameter empty; PK-Zip is technically a valid option but not recommended). We can then call the API and handle errors via the try/catch block.
Java

ZipArchiveApi apiInstance = new ZipArchiveApi();
String password = "password_example"; // String | Password to place on the Zip file; the longer the password, the more secure
File inputFile1 = new File("/path/to/inputfile"); // File | First input file to perform the operation on.
String encryptionAlgorithm = "encryptionAlgorithm_example"; // String | Encryption algorithm to use; possible values are AES-256 (recommended), AES-128, and PK-Zip (not recommended; legacy, weak encryption algorithm). Default is AES-256.
try {
    Object result = apiInstance.zipArchiveZipCreateQuarantine(password, inputFile1, encryptionAlgorithm);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ZipArchiveApi#zipArchiveZipCreateQuarantine");
    e.printStackTrace();
}

After the API returns our quarantined file, we can upload the archive to a cloud-based quarantine repository, transfer it to a virtual machine, or take any number of different actions.

Conclusion

In this article, we discussed the benefits of quarantining malicious files after our antimalware software flags them. We then highlighted several open-source Java libraries that can be collectively used to quarantine malicious files in an encrypted, password-protected zip archive. Finally, we highlighted one fully realized (not open source) web API solution for handling each stage of that process with minimal code.
As organizations embrace Kubernetes for cloud-native applications, managing infrastructure efficiently becomes challenging. Traditional Infrastructure as Code (IaC) tools like Terraform, Pulumi, and others provide declarative configurations but lack seamless integration into Kubernetes-native workflows. Crossplane effectively bridges the gap between Kubernetes and cloud infrastructure in this situation. In this blog, we'll explore how Crossplane enables IaC for Kubernetes and beyond.

What Is Crossplane?

Crossplane is an open-source Kubernetes add-on that enables you to provision and manage cloud infrastructure using Kubernetes Custom Resource Definitions (CRDs) and the Kubernetes API. Unlike traditional IaC tools that require external execution (like Terraform scripts being run externally), Crossplane embeds infrastructure management into Kubernetes. This makes it truly declarative and GitOps-friendly.

Use Cases: Terraform vs. Crossplane

When to Use Terraform?

- Best for managing infrastructure outside Kubernetes
- Ideal for traditional multi-cloud deployments and VMs
- Strong ecosystem with extensive modules and providers
- Works well with tools like Ansible, Packer, and Vault for automation

When to Use Crossplane?

- Best for Kubernetes-centric environments
- Ideal for GitOps workflows (ArgoCD, Flux)
- Enables self-service provisioning via Kubernetes CRDs
- Good for multi-cloud Kubernetes control (managing cloud services via the K8s API)

Getting Started With Crossplane

For this sample, we will use minikube, but the same steps can be applied to any Kubernetes cluster.

Step 1: Deploy MySQL in Kubernetes

1. Deploy MySQL as a Deployment with a Service so that it can be configured using Crossplane. You can also use MySQL deployed at another location.

2. Define a mysql-deployment.yaml, which creates the secret, deployment, and service required to run MySQL.

YAML

apiVersion: v1
kind: Secret
metadata:
  name: mysql-root-password
type: Opaque
data:
  password: cGFzc3dvcmQ= # Base64 encoded "password"
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
spec:
  selector:
    app: mysql
  ports:
    - protocol: TCP
      port: 3306
      targetPort: 3306
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - image: mysql:8.0
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-root-password
                  key: password
          ports:
            - containerPort: 3306
              name: mysql

3. Apply the YAML using the command kubectl apply -f mysql-deployment.yaml.

4. Verify the pods are up using the command kubectl get pods.

5. Verify the MySQL connection by starting a temporary SQL pod to check the MySQL deployment. Create the client by using the command kubectl run mysql-client --image=mysql:8.0 -it --rm -- bash.

6. Connect to MySQL inside the pod by using the command mysql -h mysql-service.default.svc.cluster.local -uroot -ppassword.

Step 2: Install Crossplane on Kubernetes

1. Install Crossplane using Helm:

Shell

kubectl create namespace crossplane-system
helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update
helm install crossplane crossplane-stable/crossplane --namespace crossplane-system

Note: Crossplane takes a few minutes to come up.

2. Verify the Crossplane installation using the command kubectl get pods -n crossplane-system.

Step 3: Install the Crossplane Provider for SQL

1. Define a MySQL provider using the below YAML content.
YAML

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-sql
spec:
  package: xpkg.upbound.io/crossplane-contrib/provider-sql:v0.9.0

2. Create the provider using the command kubectl apply -f provider.yaml.

3. Verify the provider using the following commands: kubectl get pods -n crossplane-system and kubectl get providers.

Note: SQL providers take a few minutes to come up.

Step 4: Configure the Crossplane MySQL Provider

The provider configuration tells Crossplane how to authenticate with MySQL. Define the secret to be created for provider usage, updating the stringData accordingly in the below YAML. Apply the YAML using kubectl apply -f mysql-secret.yaml.

YAML

apiVersion: v1
kind: Secret
metadata:
  name: mysql-conn-secret
  namespace: default
type: Opaque
stringData:
  credentials: "root:password@tcp(mysql-service.default.svc.cluster.local:3306)"
  username: "root"
  password: "password"
  endpoint: "mysql-service.default.svc.cluster.local"
  port: "3306"

Apply the below provider configuration for Crossplane, which uses the above secret. Apply it using the command kubectl apply -f providerconfig.yaml.

YAML

apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: ProviderConfig
metadata:
  name: mysql-provider
spec:
  credentials:
    source: MySQLConnectionSecret
    connectionSecretRef:
      name: mysql-conn-secret
      namespace: default

Verify the provider config creation using the commands kubectl get providerconfigs.mysql.sql.crossplane.io and kubectl get crds | grep providerconfig.

Step 5: Create a MySQL Database Using Crossplane

Now, use Crossplane to provision a new database. Use the below YAML and apply it using kubectl apply -f mysqlinstance.yaml.

YAML

apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: Database
metadata:
  name: my-database
spec:
  providerConfigRef:
    name: mysql-provider
  forProvider:
    binlog: true
  writeConnectionSecretToRef:
    name: db-conn
    namespace: default

Step 6: Verify the Database Creation

Verify the database creation using the command kubectl get database.mysql.sql.crossplane.io/my-database. Use the same verification steps mentioned in Step 1 to connect to MySQL and verify the creation of the database.

With the above steps, you have installed Crossplane, configured the MySQL provider, and used Crossplane to provision a database.

Can Terraform and Crossplane Work Together?

Terraform and Crossplane can be used together in many scenarios.

Scenario 1

In a complete IaC scenario, Terraform can be used to bootstrap Kubernetes clusters, and then Crossplane can be used to manage cloud resources from within Kubernetes. Terraform can also deploy Crossplane itself. A hybrid workflow example: Terraform provisions the Kubernetes cluster in any cloud provider, while Crossplane manages cloud services (databases, storage, and networking) using Kubernetes CRDs.

Scenario 2

Crossplane also supports a Terraform provider, which can be used to run Terraform scripts as part of Crossplane's IaC model. Running a Terraform provider for Crossplane can be useful in several scenarios where Crossplane's native providers do not yet support certain cloud resources or functionalities.
Following are the reasons to run a Terraform provider for Crossplane:

- Terraform has a vast ecosystem of providers, supporting many cloud services that Crossplane may not yet have native providers for.
- When an organization already uses Terraform for infrastructure management, there is no need to rewrite everything in Crossplane CRDs.
- Crossplane supports multi-cloud management, but its native providers may not cover every on-premise or SaaS integration.
- For organizations looking to gradually transition from Terraform to Crossplane, using Terraform providers within Crossplane can act as a hybrid solution before full migration.
- Running Terraform inside Crossplane brings Terraform under Kubernetes' declarative GitOps model.

Steps to Create an IBM Cloud Cloudant DB Using Crossplane

Step 1: Define the Terraform provider.

YAML

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-terraform
spec:
  package: xpkg.upbound.io/upbound/provider-terraform:v0.19.0

Step 2: Configure the provider.

YAML

apiVersion: tf.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: terraform-provider-ibm
spec: {}

Step 3: Provision a Cloudant DB in IBM Cloud by using Terraform scripts as part of Crossplane.

YAML

apiVersion: tf.upbound.io/v1beta1
kind: Workspace
metadata:
  name: ibm-cloudant-db
spec:
  providerConfigRef:
    name: terraform-provider-ibm
  writeConnectionSecretToRef:
    name: ibmcloud-terraform-secret
    namespace: crossplane-system
  forProvider:
    source: Inline
    module: |
      terraform {
        required_providers {
          ibm = {
            source = "IBM-Cloud/ibm"
          }
        }
        backend "kubernetes" {
          secret_suffix = "ibmcloud-terraform-secret"
          namespace     = "crossplane-system"
        }
      }

      provider "ibm" {
        ibmcloud_api_key = var.ibmcloud_api_key
      }

      resource "ibm_cloudant" "cloudant_instance" {
        name     = "crossplanecloudant"
        location = "us-south"
        plan     = "lite"
      }

      variable "ibmcloud_api_key" {
        type = string
      }
    vars:
      - key: ibmcloud_api_key
        value: "<Your IBM Cloud API Key>"

This provisions a Cloudant DB named crossplanecloudant in IBM Cloud.

How Crossplane Fits Into Platform Engineering

Platform engineering focuses on building and maintaining internal developer platforms (IDPs) that simplify infrastructure management and application deployment. Crossplane plays a significant role in this by enabling a Kubernetes-native approach: it ensures declarative, self-service, and policy-driven provisioning of cloud resources. Crossplane features such as declarative infrastructure with K8s APIs, custom abstractions for infra and apps, security and compliance guardrails, version-controlled and automated deployments, and continuous drift correction all support platform engineering.

Conclusion

Crossplane transforms how we manage cloud infrastructure by bringing IaC into the Kubernetes ecosystem. Its use of Kubernetes APIs enables a truly declarative and GitOps-driven approach to provisioning and managing cloud resources. If you're already using Kubernetes and looking to modernize your IaC strategy, Crossplane is definitely worth exploring.
DZone events bring together industry leaders, innovators, and peers to explore the latest trends, share insights, and tackle industry challenges. From Virtual Roundtables to Fireside Chats, our events cover a wide range of topics, each tailored to provide you, our DZone audience, with practical knowledge, meaningful discussions, and support for your professional growth.

DZone Events Happening Soon

Below, you'll find upcoming events that you won't want to miss.

What to Consider When Building an IDP

Date: March 4, 2025
Time: 1:00 PM ET
Register for Free!

Is your development team bogged down by manual tasks and "TicketOps"? Internal Developer Portals (IDPs) streamline onboarding, automate workflows, and enhance productivity—but should you build or buy? Join Harness and DZone for a webinar to explore key IDP capabilities, compare Backstage vs. managed solutions, and learn how to drive adoption while balancing cost and flexibility.

DevOps for Oracle Applications with FlexDeploy: Automation and Compliance Made Easy

Date: March 11, 2025
Time: 1:00 PM ET
Register for Free!

Join Flexagon and DZone as Flexagon's CEO unveils how FlexDeploy is helping organizations future-proof their DevOps strategy for Oracle Applications and Infrastructure. Explore innovations for automation through compliance, along with real-world success stories from companies that have adopted FlexDeploy.

Make AI Your App Development Advantage: Learn Why and How

Date: March 12, 2025
Time: 10:00 AM ET
Register for Free!

The future of app development is here, and AI is leading the charge. Join OutSystems and DZone, on March 12th at 10 AM ET, for an exclusive webinar with Luis Blando, CPTO of OutSystems, and John Rymer, industry analyst at Analysis.Tech, as they discuss how AI and low-code are revolutionizing development. You will also hear from David Gilkey, Leader of Solution Architecture, Americas East at OutSystems, and Roy van de Kerkhof, Director at NovioQ. This session will give you the tools and knowledge you need to accelerate your development and stay ahead of the curve in the ever-evolving tech landscape.

Developer Experience: The Coalescence of Developer Productivity, Process Satisfaction, and Platform Engineering

Date: March 12, 2025
Time: 1:00 PM ET
Register for Free!

Explore the future of developer experience at DZone's Virtual Roundtable, where a panel will dive into key insights from the 2025 Developer Experience Trend Report. Discover how AI, automation, and developer-centric strategies are shaping workflows, productivity, and satisfaction. Don't miss this opportunity to connect with industry experts and peers shaping the next chapter of software development.

Unpacking the 2025 Developer Experience Trends Report: Insights, Gaps, and Putting it into Action

Date: March 19, 2025
Time: 1:00 PM ET
Register for Free!

We've just seen the 2025 Developer Experience Trends Report from DZone, and while it shines a light on important themes like platform engineering, developer advocacy, and productivity metrics, there are some key gaps that deserve attention. Join Cortex Co-founders Anish Dhar and Ganesh Datta for a special webinar, hosted in partnership with DZone, where they'll dive into what the report gets right—and challenge the assumptions shaping the DevEx conversation. Their take? Developer experience is grounded in clear ownership. Without ownership clarity, teams face accountability challenges, cognitive overload, and inconsistent standards, ultimately hampering productivity.
Don't miss this deep dive into the trends shaping your team's future.

Accelerating Software Delivery: Unifying Application and Database Changes in Modern CI/CD

Date: March 25, 2025
Time: 1:00 PM ET
Register for Free!

Want to speed up your software delivery? It's time to unify your application and database changes. Join us for Accelerating Software Delivery: Unifying Application and Database Changes in Modern CI/CD, where we'll teach you how to seamlessly integrate database updates into your CI/CD pipeline.

Petabyte Scale, Gigabyte Costs: Mezmo's ElasticSearch to Quickwit Evolution

Date: March 27, 2025
Time: 1:00 PM ET
Register for Free!

For Mezmo, scaling their infrastructure meant facing significant challenges with ElasticSearch. That's when they made the decision to transition to Quickwit, an open-source, cloud-native search engine designed to handle large-scale data efficiently. This is a must-attend session for anyone looking for insights on improving search platform scalability and managing data growth.

What's Next?

DZone has more in store! Stay tuned for announcements about upcoming Webinars, Virtual Roundtables, Fireside Chats, and other developer-focused events. Whether you're looking to sharpen your skills, explore new tools, or connect with industry leaders, there's always something exciting on the horizon. Don't miss out — save this article and check back often for updates!
Large language models (LLMs) have impacted natural language processing (NLP) by introducing advanced applications such as text generation, summarization, and conversational AI. Models like ChatGPT use a specific neural architecture called a transformer to predict the next word in a sequence, learning from enormous text datasets through self-attention mechanisms. This guide breaks down the step-by-step process for training generative AI models, including pre-training, fine-tuning, alignment, and practical considerations.

Overview of the Training Pipeline

Figure 1: Overview of LLM Training Pipeline

The training pipeline for LLMs is a structured, multi-phase process designed to enhance linguistic understanding, task-specific capabilities, and alignment with human preferences.

- Data collection and preprocessing. Vast text data from diverse sources is collected, cleaned, tokenized, and normalized to ensure quality. High-quality, domain-specific data improves factual accuracy and reduces hallucinations.
- Pre-training. This is the foundational stage where the model learns general language patterns through self-supervised learning, a technique for the model to teach itself patterns in text without needing labeled examples (take, for example, next token prediction). This phase relies on massive datasets and transformer architectures to build broad linguistic capabilities.
- Instruction fine-tuning. The model is trained on smaller, high-quality input-output datasets to specialize in specific tasks or domains. This instruction tuning step ensures more accurate and contextually appropriate outputs.
- Model alignment. Reinforcement learning from human feedback (RLHF) refines the model's behavior:
  - Reward model training. Human evaluators rank outputs to train a reward model.
  - Policy optimization. The LLM is iteratively optimized to align with human preferences, ethical considerations, and user expectations.
- Evaluation and iterative fine-tuning. The model is tested on unseen data to evaluate metrics like accuracy and coherence. Further fine-tuning may follow to adjust hyperparameters or incorporate new data.
- Downstream application adaptation. The trained LLM is adapted for real-world applications (e.g., chatbots, content generation) through additional fine-tuning or integration with task-specific frameworks.

This pipeline transforms LLMs from general-purpose models into specialized tools capable of addressing diverse tasks effectively.

1. Pre-Training

Pre-training is the foundational stage in the development of LLMs, where a model learns general language patterns and representations from vast amounts of text data. This phase teaches the model grammar rules, contextual word relationships, and basic logical patterns (e.g., cause-effect relationships in text), forming the basis for its ability to perform diverse downstream tasks.

How Pre-Training Works

Figure 2: High-level overview of the pre-training stage

Objective

The primary goal of pre-training is to enable the model to predict the next token in a sequence. This is achieved through causal language modeling (CLM), a way to teach the model to predict what comes next in a sentence. In this step, the model learns to generate coherent and contextually relevant text by looking only at the past tokens.

Datasets

Pre-training requires massive and diverse datasets sourced from books, articles, websites, and other publicly available content. Popular datasets include Common Crawl, Wikipedia, The Pile, and BookCorpus.
These datasets are often cleaned and normalized to ensure high-quality input, with techniques like deduplication and tokenization applied during preprocessing. Long-context data is curated to increase the context length of the model.

Pre-Training Process

The model learns to predict the next token in a sequence through causal language modeling. The model's predictions are compared to the actual next words using a cross-entropy loss function, which measures model performance during training. Model parameters are continuously adjusted to minimize prediction errors (loss) until the model reaches an acceptable accuracy level.

The pre-training phase requires significant computational resources, often utilizing thousands of GPU hours across distributed systems to process the massive datasets needed for effective training. This is a self-supervised learning approach where the model learns patterns directly from raw text without manual labels, eliminating costly human annotation by having the model predict next tokens.

In the following example, we use a GPT-2 model, which was pre-trained on a very large corpus of English data in a self-supervised fashion with no human labeling involved.

Python

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

input_text = "The capital of France is"

# Tokenize the input text
model_inputs = tokenizer([input_text], return_tensors="pt")

# Run inference on the pretrained model and decode the output
generated_ids = model.generate(**model_inputs, max_new_tokens=25, do_sample=True)
print(tokenizer.batch_decode(generated_ids)[0])

As expected, the model is able to complete the sentence "The capital of France is" by iteratively predicting the next token as per its pre-training.

Plain Text

The capital of France is the city of Paris which is more prosperous than the other capitals in ...

However, when phrased as a question, i.e., "What is the capital of France?" the model fails to produce the correct result because, at this stage of the training, it can't follow instructions yet.

Python

text2 = "What is the capital of France?"
model_inputs = tokenizer([text2], return_tensors="pt")
generated_ids = model.generate(**model_inputs, max_new_tokens=25, do_sample=True)
print(tokenizer.batch_decode(generated_ids)[0])

Output:

Plain Text

What is the capital of France? In our opinion we should be able to count the number of people in France today. The government has made this a big priority

Benefits of Pre-Training

- Broad language understanding. By training on diverse data, pre-trained models develop a comprehensive grasp of language structures and patterns, enabling them to generalize across various tasks.
- Efficiency. Pre-trained models can be fine-tuned for specific tasks with smaller labeled datasets, saving time and resources compared to training models from scratch for each task.
- Performance. Models that undergo pre-training followed by fine-tuning consistently outperform those trained solely on task-specific data due to their ability to leverage knowledge from large-scale datasets.

2. Instruction Fine-Tuning

Instruction fine-tuning is a specialized training technique that transforms general-purpose LLMs into responsive, instruction-following systems. Here, the model is trained on specific tasks like answering questions or summarizing text.
By training models on curated (instruction, output) pairs, this method aligns LLMs' text generation capabilities with human-defined tasks and conversational patterns. A training (instruction, output) sample looks like this:

Plain Text

Instruction: What is the capital of Germany?
Response: The capital of Germany is Berlin.

Figure 3: Instruction fine-tuning stage

In the following example, we load the Gemma 2 LLM model from Google, which is instruction-tuned on a variety of text generation tasks, including question answering, summarization, and reasoning.

Python

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load Gemma 2 2b instruct model
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

# Tokenize input
input_text = "What is the capital of France?"
input_ids = tokenizer(input_text, return_tensors="pt")

# Run model inference and decode output
outputs = model.generate(**input_ids, max_new_tokens=25, do_sample=True)
print(tokenizer.decode(outputs[0]))

This fine-tuned model is able to follow instructions:

Plain Text

What is the capital of France? The capital of France is Paris.

How Instruction Fine-Tuning Works

Objective

Instruction fine-tuning bridges the critical gap between an LLM's fundamental next-word prediction capability and practical task execution by teaching models to understand and follow natural language instructions. This process transforms general-purpose LLMs into responsive, instruction-following systems that consistently follow user commands like "Summarize this article" or "Write a Python function for X."

Supervised Learning

Unlike pre-training, which uses self-supervised learning on unlabeled data, instruction fine-tuning employs supervised learning with labeled instruction-output pairs. The process involves:

- Using explicit instruction-response pairs for training
- Updating model weights to optimize for instruction following
- Maintaining the model's base knowledge while adapting response patterns

Dataset

The instruction dataset consists of three key components:

- Instruction – natural language command or request
- Input – optional context or examples
- Output – desired response demonstrating correct task execution

Plain Text

Instruction: Find the solution to the quadratic equation.
Context: 3x² + 11x - 4 = 0
Response: The solution of the quadratic equation is x = -4 and x = 1/3.

These datasets can be created through manual curation by domain experts, synthetic generation using other LLMs, or conversion of existing labeled datasets into instruction format.

Fine-Tuning Techniques

Two primary approaches dominate instruction fine-tuning:

- Full model fine-tuning updates all model parameters, offering better performance for specific tasks at the cost of higher computational requirements.
- Lightweight adaptation methods (like LoRA) modify small parts of the model instead of retraining everything, significantly reducing memory requirements (see the sketch at the end of this section).

Benefits of Instruction Fine-Tuning

- Enhanced task generalization. Models develop meta-learning capabilities, improving performance on novel tasks without specific training.
- Reduced prompt engineering. Fine-tuned models require fewer examples in prompts, making deployment more efficient.
- Controlled output. Enables precise customization of response formats and styles.
- Better instruction following. Bridges the gap between model capabilities and user expectations.
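To illustrate the lightweight adaptation approach referenced above, here is a minimal, hedged sketch using the Hugging Face peft library to attach LoRA adapters to a small base model before instruction fine-tuning; the rank, alpha, and target module names are illustrative assumptions that depend on the model architecture:

Python

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Small base model used purely for illustration
base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA trains small adapter matrices while the base weights stay frozen
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update (assumption)
    lora_alpha=16,              # scaling factor (assumption)
    target_modules=["c_attn"],  # attention projection in GPT-2 (model-specific assumption)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically a small fraction of the full parameter count

The wrapped model can then be trained on (instruction, output) pairs with a standard supervised fine-tuning loop or trainer.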
3. Alignment Tuning

Alignment or preference tuning is a critical phase in training LLMs to ensure the model avoids harmful or biased responses. This step goes beyond improving performance on specific tasks: it focuses on making models safer, more helpful, and user-aligned by incorporating human feedback or predefined guidelines.

Why Alignment Is Necessary

Pre-trained LLMs are trained on massive datasets from the internet, which may contain biases, harmful content, or conflicting information. Without alignment, these models might give answers that are offensive or misleading. Alignment tuning filters harmful outputs (e.g., biased or dangerous content) using human feedback to ensure responses comply with safety guidelines. The following example from OpenAI's GPT-4 System Card shows the safety challenges that arise from the non-aligned "GPT-4 (early)" model.

Figure 4: Safety risks in the pre-alignment version of the "GPT-4 (early)" model

The GPT-4 system card highlights the importance of fine-tuning the model using RLHF to align the model's responses with human preferences for helpfulness and harmlessness. It mitigates unsafe behavior and prevents the model from generating harmful content and biases.

Key Methods for Alignment

The following diagram from the DPO paper illustrates the most commonly used methods:

Figure 5: (Left) RLHF workflow showing human feedback integration. (Right) DPO skips reward modeling to directly align responses

Reinforcement Learning from Human Feedback (RLHF)

RLHF is a machine learning technique designed to align LLMs with human values, preferences, and expectations. By incorporating human feedback into the training process, RLHF enhances the model's ability to produce outputs that are coherent, useful, ethical, and aligned with user intent. This method has been crucial for making generative models like ChatGPT and Google Gemini safer and more reliable. The RLHF process consists of three main steps:

Step | Description | Outcome
Human feedback | Annotators rank outputs for relevance/ethics | Preference dataset creation
Reward model | Trained to predict human preferences | Quality scoring system
Policy optimization | LLM fine-tuned via reinforcement learning (e.g., PPO) | Aligned response generation

- Collecting human feedback. Human annotators evaluate model-generated outputs by ranking or scoring them based on criteria such as relevance, coherence, and accuracy. Pairwise comparisons are commonly used, where annotators select the better response between two options. This feedback forms a "preference dataset" that reflects human judgment.
- Training a reward model. A reward model is trained using the preference dataset to predict how well a given response aligns with human preferences. The reward model assigns a scalar reward score (say, 0 to 10) to outputs, which is later used to train the LLM to prioritize high-scoring responses.
- Fine-tuning with reinforcement learning. The LLM is fine-tuned using reinforcement learning algorithms like Proximal Policy Optimization (PPO), which teaches the model to improve gradually rather than making dramatic changes all at once. The reward model guides this process by providing feedback on generated outputs, enabling the LLM to optimize its policy for producing high-reward responses.

Direct Preference Optimization (DPO)

Direct Preference Optimization (DPO) is an emerging training method designed to align LLMs with human preferences. It serves as a simpler and more efficient alternative to RLHF, bypassing the need for complex reinforcement learning algorithms like Proximal Policy Optimization (PPO). Instead, DPO skips reward modeling by directly training the LLM on human-ranked responses. The preference data generation process remains the same as highlighted in the RLHF method above. The DPO process consists of:

- Direct optimization. Unlike RLHF, which trains a reward model and uses reinforcement learning, DPO directly fine-tunes the LLM to produce outputs that maximize alignment with the ranked preferences. This is achieved by directly training the model to favor high-ranked responses and avoid low-ranked ones (a minimal sketch of this objective follows below).
- Model training. The optimization process adjusts the model's parameters to prioritize generating responses that align with human preferences, without requiring iterative policy updates as in RLHF.
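To make the direct optimization step concrete, here is a minimal, hedged sketch of the DPO objective in PyTorch; it assumes you already have summed log-probabilities of each chosen and rejected response under both the policy model and a frozen reference model, and beta is an illustrative hyperparameter:

Python

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # How much more (or less) the policy prefers each response than the reference model does
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the margin between chosen and rejected responses to be positive
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Illustrative tensors standing in for per-example sequence log-probabilities
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.5]))
print(loss)

In practice, libraries that implement DPO wrap this objective with batching, reference-model handling, and logging, but the core loss is this single comparison.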
Model alignment has been successfully applied across various domains:

- Conversational AI. Aligning chatbots with user expectations for tone, relevance, and ethical standards.
- Content generation. Optimizing models for tasks like summarization or creative writing based on user-defined quality metrics.
- Ethical AI development. Ensuring models adhere to guidelines for fairness, safety, and inclusivity without extensive computational overhead.

Conclusion

This guide shows you the nuts and bolts of LLM training. Are you ready to dive in? Many open-source models and datasets are waiting for you to experiment with and adapt them to solve your specific problems.
In the world of data, choosing the right SQL database can make or break your organization's success. With several options available, database selection is a crucial decision that can shape the performance, scalability, and efficiency of your data platform. Finding the perfect fit for your specific needs requires careful consideration of various factors and taking time to understand different database types.

This article guides you through the process of selecting a SQL database. We'll explore the main types of SQL databases, discuss key factors to consider when making your choice, and take a look at some popular options in the market. By the end, you'll have a clearer picture of how to pick a database that aligns with your project requirements and business goals — setting you up for better data management and analysis.

Understanding SQL Database Types

SQL databases have evolved over time to meet diverse data management needs. We'll explore three main types of SQL databases: relational databases, object-relational databases, and NewSQL databases.

Relational Databases

Relational databases are the foundation of SQL database systems. They organize data into tables consisting of rows and columns. Each table represents a specific entity, like customers or orders, and the columns define the attributes of that entity. This structured approach allows for efficient data storage and retrieval.

One of the key features of relational databases is the use of primary and foreign keys. A primary key uniquely identifies each record in a table, while foreign keys establish relationships between tables. This interconnected structure enables complex queries and data analysis across multiple tables.

Relational databases excel at maintaining data integrity through the implementation of ACID (atomicity, consistency, isolation, durability) properties. These properties ensure that transactions are processed reliably and data remains accurate and consistent.

Popular examples of relational databases include MySQL, Oracle Database, and Microsoft SQL Server. These systems have a long-standing reputation for reliability and are widely used in various industries.

Object-Relational Databases

Object-relational databases bridge the gap between traditional relational databases and object-oriented programming concepts. They combine the structured data storage of relational databases with the flexibility of object-oriented models.

These databases support complex data types and allow for the storage of objects directly within the database schema. This capability makes them particularly useful for applications that deal with complex data structures or require seamless integration with object-oriented programming languages.

PostgreSQL is a prime example of an object-relational database management system. It offers the benefits of a relational database while providing support for user-defined objects and table inheritance. This combination of features makes PostgreSQL a versatile choice for applications that need to handle diverse data types and complex relationships.

NewSQL Databases

NewSQL databases (like Apache Trafodion, Clustrix, Google Spanner, MySQL Cluster, etc.) represent the latest evolution in SQL database technology. They aim to provide the scalability and performance benefits of NoSQL databases while maintaining the ACID compliance and relational structure of traditional SQL databases. These databases are designed to handle large-scale, distributed environments and high-concurrency workloads.
They achieve this through various architectural advancements, including:

- Distributed architecture. NewSQL databases can scale horizontally across multiple servers, allowing them to handle massive datasets and concurrent transactions efficiently.
- In-memory storage. By utilizing main memory for data storage, NewSQL databases can significantly improve read and write operations, enhancing overall performance.
- ACID compliance. Despite their distributed nature, NewSQL databases maintain strict ACID properties, ensuring data integrity and consistency in complex transactional scenarios.

NewSQL databases are particularly well-suited for applications that require real-time analytics, high-volume transaction processing, and strong data consistency. While NewSQL databases offer impressive capabilities, it's important to note they may have a steeper learning curve compared to traditional relational databases — and since the ecosystem of tools and services supporting NewSQL is still developing, it could impact integration with your existing infrastructure.

Key Factors in SQL Database Selection

When choosing the right SQL database for your project, several key factors require careful consideration because of the impact they can have on the performance, scalability, and overall success of your data platform. Let's explore the critical aspects to evaluate during the database selection process.

Data Model and Schema

The data model and schema play a crucial role in database selection. It's essential to thoroughly understand the structure of your data and how it will be organized within the database. Here's what to consider:

- Analyze your data requirements and create a comprehensive data dictionary that defines every column of information you plan to store.
- Separate your data into logical tables and columns, aiming for a structure that makes sense and minimizes redundancy across tables.
- Plan the constraints for each table, including primary keys, foreign keys, and their formats (single-column or multi-column).
- Choose appropriate data types for your columns, keeping in mind that columns with foreign key relationships must share the same data type as the parent column.
- Consider the specific requirements of your chosen database system. For example, some databases may have recommendations for time-related data types or primary key constraints.

By carefully designing your data model and schema, you can ensure your chosen SQL database aligns with your project's needs and supports efficient data management.

Scalability Requirements

Scalability has a significant influence on database selection — after all, it determines how well your system can accommodate growth. Here's what you'll want to consider when evaluating scalability:

- Assess your project's expected growth and how well the database can handle expansion.
- Understand the differences between vertical and horizontal scaling. Vertical scaling involves increasing the capacity of a single server, while horizontal scaling adds more servers to the system.
- Evaluate the database's ability to scale horizontally, especially if you anticipate rapid growth or high-traffic workloads.
- Consider the trade-offs between different database types. For example, traditional relational databases may struggle with horizontal scaling, while NewSQL databases often excel in this area.
- Consider the trade-offs between different database types. For example, traditional relational databases may struggle with horizontal scaling, while NewSQL databases often excel in this area.
- Explore NewSQL databases, which aim to combine the scalability of NoSQL with the transactional consistency of relational databases.
- Assess the database's performance under increasing data volumes and traffic loads to ensure it can meet your scalability requirements.

Performance Needs

Performance has a direct impact on user experience and is a critical factor in database selection. Consider the following aspects when evaluating performance:

- Analyze your project's specific performance requirements, including query efficiency and the balance between read and write operations.
- Assess the database's ability to efficiently handle complex queries, joins, and aggregations.
- Consider the performance characteristics of different database types. For example, NoSQL databases may offer faster write speeds, while relational databases excel at complex queries.
- Evaluate the database's ability to handle high-volume write operations, especially for applications that generate constant data updates.
- Assess the database's support for indexing and query optimization techniques to enhance performance.
- Consider the impact of data volume on query performance and how well the database scales as data grows.
- Evaluate the database's ability to handle concurrent operations and maintain performance under heavy loads.

By carefully considering these key factors — data model and schema, scalability requirements, and performance needs — you can make an informed decision when selecting a SQL database. This ensures your chosen database aligns with your project's specific requirements and supports your data platform's (and your organization's) long-term success.

Popular SQL Database Options

When it comes to database selection, several SQL database options stand out in the market. Each has its unique features and strengths, making them suitable for different use cases. Let's explore some of the most popular SQL database options to help you make an informed decision for your data platform.

MySQL

MySQL has established itself as a leading open-source relational database management system. Its popularity stems from its reliability, ease of use, and scalability. MySQL has a significant impact on web applications, powering many of the world's largest websites and applications, including Twitter, Facebook, Netflix, and Spotify.

One of MySQL's key advantages is its user-friendly nature. Getting started with MySQL is relatively straightforward, thanks to its comprehensive documentation and large community of developers. The abundance of MySQL-related resources online further supports its ease of use.

MySQL was designed with a focus on speed and reliability. While it may not fully adhere to standard SQL, MySQL developers continuously work towards closer compliance. To bridge this gap, MySQL offers various SQL modes and extensions that bring it closer to standard SQL functionality.

Unlike some other database systems, MySQL operates through a separate daemon process. This architecture allows for greater control over database access, enhancing security and management capabilities.

PostgreSQL

PostgreSQL, often referred to as Postgres, bills itself as "the most advanced open-source relational database in the world." It was created with the goal of being highly extensible and standards-compliant.
PostgreSQL is an object-relational database, combining the structured data storage of relational databases with the flexibility of object-oriented models. One of PostgreSQL's standout features is its ability to handle complex data structures efficiently. It supports user-defined objects and table inheritance, making it particularly useful for applications that deal with diverse data types and complex relationships.

PostgreSQL excels in handling concurrent tasks (more commonly referred to as concurrency). It achieves this without read locks thanks to its implementation of Multiversion Concurrency Control (MVCC) — which also ensures ACID compliance.

In addition to supporting standard numeric, string, and date/time data types, PostgreSQL offers support for geometric shapes, network addresses, bit strings, text searches, and JSON entries. This versatility makes PostgreSQL a powerful choice for a wide range of database applications.

All Your SQL Needs in One Database

Choosing a SQL database has a significant impact on the success of your data platform. By considering factors including data model, scalability, and performance needs, organizations can select a database that aligns with their project requirements and business goals. This thoughtful approach to database selection sets the stage for efficient data management and analysis, enabling businesses to leverage their data effectively.

In the end, the right SQL database empowers organizations to handle their data needs efficiently and securely. Whether it's MySQL's user-friendly nature, PostgreSQL's advanced features, or SQL Server's integration capabilities, each option offers unique strengths. By understanding these options and matching them with specific project needs, businesses can build a strong foundation for their data-driven initiatives and stay competitive in today's data-centric world.
Vision AI models have a flaw. When shown a medical scan, they might correctly diagnose a condition while citing anatomically impossible reasons. Or they might solve a geometry problem with the right answer but skip essential theorems and rely on made-up ones instead. These models reach correct conclusions through reasoning that makes no sense.

The Gap in Visual Reasoning Models

This hints at a deeper problem. Current models don't really think through visual problems — they pattern-match their way to answers. The LlamaV-o1 team discovered this by doing something simple: they forced their model to show its work. The results revealed that most visual reasoning errors don't come from failing to see what's in an image. They come from skipping key logical steps between seeing and concluding.

This gap between seeing and reasoning matters. A model that gets the right answer through wrong reasoning is like a student who memorizes solutions without understanding the principles. It will fail unpredictably when faced with new problems.

The solution turns out to require rethinking how we train these models. Today's standard approach gives a model an image and a question, then trains it to predict the correct answer. This works well enough to pass many benchmarks. But it's like teaching a student to recognize answer patterns without understanding the underlying concepts: training to answer physics problems by memorizing flashcards with the problem on the front and a single number, the answer, on the back.

LlamaV-o1's Training Approach

LlamaV-o1, the top new vision question-answering paper on AIModels.fyi today, takes a different path. The training process is divided into two stages. In Stage 1, the model is trained simultaneously on summarization and caption generation using samples from the PixMo and Geo170K datasets. Stage 2 then builds upon this foundation to handle detailed reasoning and final answer generation using the Llava-CoT dataset. Each reasoning step must be explicit and verifiable, so the model can't take shortcuts. This mirrors the chain-of-thought approach common in many language models.

Figure 1

Figure 1 shows how LlamaV-o1 outperforms Gemini-1.5-Flash and Claude-3.5-Sonnet on a pattern recognition task from VRC-Bench. Claude-3.5-Sonnet picks "none of the options" but doesn't fully match the observed logic. Gemini-1.5-Flash also shows weaker coherence. LlamaV-o1 identifies the correct option (D) by following the pattern, proving its stronger reasoning ability.

There are three key technical advances in this paper that I think are worth pointing out. First is the introduction of VRC-Bench, a comprehensive benchmark specifically designed to evaluate multi-step reasoning tasks. Second is a novel metric that assesses visual reasoning quality at the granularity of individual steps. And third is the curriculum learning approach with beam search optimization.

Figure 2

Figure 2 highlights the variety of tasks in VRC-Bench, each requiring step-by-step reasoning. The examples span geometry (calculating angles with linear pairs), chemistry (identifying ethane from its molecular structure), chart analysis (pie charts for global energy reserves), art recognition (identifying historical paintings), sports classification, medical diagnosis (classifying tissue types), and advertisement analysis (extracting product names). Each task forces the model to explain its logical steps, from understanding the prompt to arriving at the answer.

Let's start by talking about curriculum learning with explicit reasoning supervision. The model trains in stages, mastering basic visual perception before attempting complex reasoning. At each stage, it must generate specific intermediate outputs — like describing what it sees, identifying relevant elements for the current question, and explaining each logical step. The training data contains over 4,000 manually verified reasoning steps to ensure the model learns valid reasoning patterns.

The second advance is an efficient implementation of beam search during inference. While generating each reasoning step, the model keeps track of multiple possible next steps rather than committing to the first one it thinks of. Most models avoid this due to computational costs. For example, Llava-CoT has linear scaling (O(n)) in the number of model calls. LlamaV-o1 improves upon this with a simplified beam search that achieves constant scaling (O(1)) while still exploring alternative reasoning paths effectively. These mechanisms work together — curriculum learning teaches the model how to break down problems, while beam search helps it find valid reasoning paths efficiently. The result is a model that thinks more systematically without becoming impractically slow.
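To make that idea concrete, here is a rough, purely illustrative Java sketch of step-level beam search over candidate reasoning chains. It is not the paper's code; the Chain record and StepProposer hook are hypothetical stand-ins for how a model might score possible next reasoning steps.

Java

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: keep the top-k partial reasoning chains at each step
// instead of greedily committing to the single highest-scoring next step.
public class StepBeamSearch {

    // A partial chain of reasoning steps plus its cumulative score (hypothetical).
    record Chain(List<String> steps, double score) {}

    // Hypothetical hook: given a partial chain, the model proposes scored extensions.
    interface StepProposer {
        List<Chain> expand(Chain partial);
    }

    static Chain search(StepProposer model, int beamWidth, int maxSteps) {
        List<Chain> beam = List.of(new Chain(List.of(), 0.0));
        for (int step = 0; step < maxSteps; step++) {
            List<Chain> candidates = new ArrayList<>();
            for (Chain partial : beam) {
                candidates.addAll(model.expand(partial));
            }
            if (candidates.isEmpty()) {
                break; // no further expansions proposed
            }
            // Keep only the beamWidth best-scoring chains for the next round.
            candidates.sort(Comparator.comparingDouble(Chain::score).reversed());
            beam = candidates.subList(0, Math.min(beamWidth, candidates.size()));
        }
        return beam.get(0);
    }
}

LlamaV-o1's contribution is a simplified variant that keeps the benefit of exploring alternatives without the cost a naive loop like this implies, which is what the O(1)-versus-O(n) comparison above refers to.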
And then innovation three comes last — to test this approach properly, the team had to build a new kind of benchmark. Their VRC-Bench presents problems across eight domains, from basic visual tasks to complex medical diagnoses. But unlike standard benchmarks that check only final answers, VRC-Bench verifies each reasoning step. A model can't pass by accident.

Figure 3

Figure 3 shows the new Visual Reasoning Chain benchmark (VRC-Bench), which spans math (MathVista, LogicVista), science (Science-QA), visual tasks (Blink-IQ-Test), medical imaging (MMMU-Medical), culture (ALM-Bench), documents (Doc-VQA), and charts (Chart-VQA). The bar chart compares final answer accuracy and reasoning quality across a bunch of models.

The results expose something fundamental about how AI systems learn to think. When forced to show their reasoning, most current models reveal alarming gaps where they skip necessary logical steps. LlamaV-o1 makes fewer such jumps, and when it fails, it usually fails at specific reasoning steps rather than producing mysteriously wrong conclusions.

Conclusion

I think this combination points to something important about the future of AI systems. Most current work focuses on making models faster or more accurate at producing answers. But I suspect that for truly complex tasks — the kind humans solve through careful reasoning — we'll need models that can think methodically through each step. LlamaV-o1's architecture suggests this might be possible without the huge computational costs many feared.

The approach will need testing beyond visual reasoning. Safety-critical domains like medical diagnosis or engineering seem like natural next steps to me — areas where we care more about reliable reasoning than speed. I wouldn't be surprised if the techniques pioneered here end up being more valuable for their careful reasoning capabilities than for their computer vision advances.

What do you think? Let me know on Discord or in the comments. I'd love to hear what you have to say.
Introduction to RAG and Quarkus

Retrieval-augmented generation (RAG) is a technique that enhances AI-generated responses by retrieving relevant information from a knowledge source. In this tutorial, we'll build a simple RAG-powered application using Java and Quarkus (a Kubernetes-native Java framework). Perfect for Java beginners!

Why Quarkus?

Quarkus provides multiple LangChain4j extensions to simplify AI application development, especially RAG implementation, by providing an Easy RAG module for building end-to-end RAG pipelines. Easy RAG acts as a bridge, connecting the retrieval components (like your document source) with the LLM interaction within the LangChain4j framework. Instead of manually orchestrating the retrieval, context injection, and LLM call, Easy RAG handles these steps behind the scenes, reducing the amount of code you need to write. This abstraction allows you to focus on defining your data sources and crafting effective prompts, while Easy RAG takes care of the more technical details of the RAG workflow.

Within a Quarkus application, this means you can quickly set up a RAG endpoint by simply configuring your document source and letting Easy RAG handle retrieval and querying. This tight integration with LangChain4j also means you still have access to the more advanced features of LangChain4j if you need to customize or extend your RAG pipeline beyond what Easy RAG provides out of the box. Essentially, Easy RAG significantly lowers the barrier to entry for building RAG applications in a Quarkus environment, allowing Java developers to rapidly prototype and deploy solutions without getting bogged down in lower-level implementation details. It provides a convenient and efficient way to leverage the power of RAG within the already productive Quarkus and LangChain4j ecosystem.

Step 1: Set Up Your Quarkus Project

Create a new Quarkus project using the Maven command:

Shell

mvn io.quarkus:quarkus-maven-plugin:3.18.4:create \
    -DprojectGroupId=com.devzone \
    -DprojectArtifactId=quarkus-rag-demo \
    -Dextensions='langchain4j-openai, langchain4j-easy-rag, websockets-next'

This generates a project with a simple AI bot with Easy RAG integration. Find the solution project here. The AI service refers to OpenAI by default. You can replace it with local Ollama using the quarkus-langchain4j-ollama extension rather than quarkus-langchain4j-openai.

Step 2: Explore the Generated AI Service

Open the Bot.java file in the src/main/java/com/devzone folder. The code should look like this:

Java

@RegisterAiService // no need to declare a retrieval augmentor here, it is automatically generated and discovered
public interface Bot {

    @SystemMessage("""
            You are an AI named Bob answering questions about financial products.
            Your response must be polite, use the same language as the question, and be relevant to the question.

            When you don't know, respond that you don't know the answer and the bank will contact the customer directly.
            """)
    String chat(@UserMessage String question);
}

- @RegisterAiService registers the AI service as an interface.
- @SystemMessage defines the initial instruction and scope that will be sent to the LLM as the first message.
- @UserMessage defines prompts (e.g., user input) and usually combines the request with the expected response format.

You can change these definitions to match your LLM and prompt engineering practices.
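The websockets-next extension included in Step 1 is commonly used to expose this Bot to clients. Here is a minimal sketch of what such an endpoint might look like; the class name, path, and greeting are illustrative and not necessarily the exact generated code.

Java

package com.devzone;

import io.quarkus.websockets.next.OnOpen;
import io.quarkus.websockets.next.OnTextMessage;
import io.quarkus.websockets.next.WebSocket;

// Illustrative sketch: expose the AI service over a WebSocket endpoint.
@WebSocket(path = "/chatbot")
public class ChatBotWebSocket {

    private final Bot bot;

    public ChatBotWebSocket(Bot bot) {
        this.bot = bot; // the AI service is injected like any other CDI bean
    }

    @OnOpen
    public String onOpen() {
        return "Hello, I'm Bob. How can I help you with our financial products?";
    }

    @OnTextMessage
    public String onTextMessage(String message) {
        // Easy RAG retrieval and prompt augmentation happen behind this single call.
        return bot.chat(message);
    }
}

With an endpoint like this, any WebSocket client connecting to /chatbot would get responses grounded in the documents ingested by Easy RAG.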
Step 3: Learn How to Integrate Easy RAG Into the AI Service

When the quarkus-langchain4j-easy-rag extension is added to the Quarkus project, the only steps required to ingest documents into an embedding store are to include a dependency for an embedding model and specify a single configuration property, quarkus.langchain4j.easy-rag.path, which points to a local directory containing your documents. During application startup, Quarkus automatically scans all files within the specified directory and ingests them into an in-memory embedding store, eliminating the need for manual setup or complex configuration.

Open the application.properties file in the src/main/resources folder. You should find the quarkus.langchain4j.easy-rag.path=easy-rag-catalog property.

Navigate to the easy-rag-catalog folder in the project root directory. You should find four documents in different file formats, such as txt, odt, and pdf files:

Shell

.
|____retirement-money-market.txt
|____elite-money-market-account.odt
|____smart-checking-account.pdf
|____standard-saving-account.txt

This approach significantly reduces the overhead typically associated with implementing RAG pipelines, allowing developers to focus on building their application logic rather than managing the intricacies of document ingestion and embedding storage. By leveraging the quarkus-langchain4j-easy-rag extension, developers can quickly enable their applications to retrieve and utilize relevant information from documents, enhancing the capabilities of AI-driven features such as chatbots, question-answering systems, or intelligent search functionalities. The extension's seamless integration with Quarkus ensures a smooth development experience, aligning with Quarkus's philosophy of making advanced technologies accessible and easy to use in cloud-native environments.

Step 4: Test Your Application Using Quarkus Dev Mode

Before testing the AI application, you need to set your OpenAI API key in the application.properties file:

quarkus.langchain4j.openai.api-key=YOUR_OPENAI_API_KEY

Start Quarkus dev mode to test the AI application using the following Maven command:

./mvnw quarkus:dev

The output should look like this:

Shell

Listening for transport dt_socket at address: 55962
__  ____  __  _____   ___  __ ____  ______
 --/ __ \/ / / / _ | / _ \/ //_/ / / / __/
 -/ /_/ / /_/ / __ |/ , _/ ,<  / /_/ /\ \
--\___\_\____/_/ |_/_/|_/_/|_|\____/___/
INFO [io.qua.lan.eas.run.EasyRagRecorder] (Quarkus Main Thread) Reading embeddings from /Users/danieloh/Downloads/quarkus-rag-demo/easy-rag-embeddings.json
INFO [io.quarkus] (Quarkus Main Thread) quarkus-rag-demo 1.0.0-SNAPSHOT on JVM (powered by Quarkus 3.18.4) started in 2.338s. Listening on: http://localhost:8080
INFO [io.quarkus] (Quarkus Main Thread) Profile dev activated. Live Coding activated.
INFO [io.quarkus] (Quarkus Main Thread) Installed features: [awt, cdi, langchain4j, langchain4j-easy-rag, langchain4j-openai, langchain4j-websockets-next, poi, qute, rest-client, rest-client-jackson, smallrye-context-propagation, smallrye-openapi, swagger-ui, vertx, websockets-next]
--
Tests paused
Press [e] to edit command line args (currently ''), [r] to resume testing, [o] Toggle test output, [:] for the terminal, [h] for more options>

To access the Quarkus Dev UI, press "D" on the terminal where Quarkus dev mode is running, or open http://localhost:8080/q/dev-ui/ directly in a web browser. Select "Chat" to access the experimental prompt page. This is useful for developers who want to verify a new AI service quickly without implementing REST APIs or front-end applications.

Enter the following prompt to verify the RAG functionality:

Tell me about the benefits of a "Standard savings account."

Send the prompt to OpenAI. The AI model is GPT-4o mini by default. The prompt is augmented with content retrieved from the relevant document (e.g., standard-saving-account.txt) before the user's message is sent to the LLM. A few seconds after your request is processed, you will receive a response with an answer drawn from that document.

Enhancements for Real-World Use

- Use a vector database. Replace the in-memory embedding store with Qdrant or Pinecone for scalable document retrieval.
- Add AI models. Integrate Hugging Face transformers for advanced text generation.
- Error handling. Improve robustness with retry logic and input validation (see the sketch below).
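For the error-handling item, one possible approach (a sketch only, assuming you also add the quarkus-smallrye-fault-tolerance extension, which is not part of the generated project) is to wrap the AI service in a small facade that validates input and retries transient failures:

Java

package com.devzone;

import jakarta.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Timeout;

// Illustrative sketch: validate input and retry transient LLM/API failures.
@ApplicationScoped
public class ResilientBot {

    private final Bot bot;

    public ResilientBot(Bot bot) {
        this.bot = bot;
    }

    @Retry(maxRetries = 3, delay = 500) // retry transient failures with a short delay (ms)
    @Timeout(10000)                     // fail fast instead of letting the chat hang (ms)
    public String chat(String question) {
        if (question == null || question.isBlank()) {
            return "Please enter a question about our financial products.";
        }
        return bot.chat(question.strip());
    }
}

Callers (such as the WebSocket endpoint sketched earlier) would then talk to ResilientBot instead of the Bot interface directly.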
Conclusion

You've built a basic RAG application with Java and Quarkus! This example lays the groundwork for smarter apps that combine retrieval and generation. Experiment with larger datasets or AI models to level up!