Yann Neuhaus
Mastering SSIS with Excel – Resolving the ‘Microsoft.ACE.OLEDB.16.0 Provider is Not Registered’ Error
Dear Readers,
Recently, I encountered the ‘Microsoft.ACE.OLEDB.16.0 Provider is Not Registered’ error while working on a data extraction task using SSIS. In this article, I’ll share how to fix this issue and also guide you through:
- Migrating SSIS packages from SQL Server 2017 to SQL Server 2022 using Visual Studio 2022.
- Deploying SSIS packages to the SSIS Catalog.
- Installing the 32-bit driver when the 64-bit version is already present.
In a recent project, I had to migrate SSIS packages from SQL Server 2017 to SQL Server 2022. Many of these packages used Excel as a data source. Here’s the environment setup:
- Development Environment: Windows Server 2022 with Visual Studio 2022 installed.
- Production Environment: Windows Server 2022 with SQL Server 2022 and SSMS (SQL Server Management Studio) 20.2.
Migrating these packages was straightforward using Visual Studio 2022. Below are some key steps:
2.1 SSIS Catalog Creation
Using SQL Server in the production environment, I created the SSIS Catalog.
Note: The ‘Create Catalog’ option is disabled because each SQL Server instance can have only one SSIS Catalog, and the instance used in this demo already contains an existing SSIS Catalog.
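If you want to double-check this from T-SQL, the presence of an SSIS Catalog can be confirmed as sketched below (the catalog database is named SSISDB by default):
-- Returns a row if the SSIS Catalog database exists on the instance
SELECT name FROM sys.databases WHERE name = N'SSISDB';
-- Lists the catalog properties once the catalog is in place
SELECT property_name, property_value FROM SSISDB.catalog.catalog_properties;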
2.2 Setting up Visual Studio
I installed the SQL Server Integration Services Projects 2022 extension in Visual Studio 2022.
2.3 SSIS Packages Import
Using Visual Studio 2022’s import wizard, I imported SSIS packages from SQL Server 2017. This wizard simplifies the process by extracting the packages and creating a local project.
2.4 Deploying Packages
After the migration, I used the SSIS Project Deployment feature to deploy all the packages into the new SSIS Catalog.
3 Problem Statement
After deploying the packages to the SSIS Catalog, executing them resulted in the error: ‘Microsoft.ACE.OLEDB.16.0 provider is not registered’
This issue also appeared when attempting to import Excel data through SQL Server Management Studio, where I got the same error message.
4 Root Cause Analysis
This error occurs because the Access Connectivity Engine (ACE) OLEDB provider required to connect to Excel files is not installed.
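To verify this on a given server, the OLE DB providers registered with SQL Server can be listed as shown in the sketch below (run it on the machine where the packages execute):
-- 'Microsoft.ACE.OLEDB.16.0' should appear in this list once the Access Database Engine is installed
EXEC master.dbo.sp_enum_oledb_providers;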
5 Solution
To resolve this issue, I followed these steps:
5.1 Install the 64-bit Microsoft Access Database Engine 2016 Redistributable
Download the Microsoft Access Database Engine 2016 Redistributable 64-bit and install it on both servers.
5.2 Install the 32-bit Microsoft Access Database Engine 2016 Redistributable
Download the Microsoft Access Database Engine 2016 Redistributable 32-bit if necessary.
This is required for 32-bit applications like SQL Server Management Studio’s Excel import feature.
The installation of the 32-bit version failed with the error message below because the 64-bit version was already installed:
To bypass this, use the Command Prompt to force the installation:
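A minimal sketch of that command, assuming the downloaded 32-bit installer is named AccessDatabaseEngine.exe and the Command Prompt is opened in its folder (the /quiet switch performs a silent installation and bypasses the check that blocks mixing 32-bit and 64-bit versions):
AccessDatabaseEngine.exe /quiet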
5.3 Install Additional Components (depending on your Excel version)
Download and install other components depending on the Excel version used. Details are provided in the picture below:
6 Conclusion
Connecting to Excel files using SSIS requires the ACE OLEDB provider. By installing the correct Microsoft Access Database Engine, you can fix the ‘Microsoft.ACE.OLEDB.16.0 provider is not registered’ error.
Additionally, I walked you through migrating SSIS packages from SQL Server 2017 to SQL Server 2022, deploying them to the SSIS Catalog, and force-installing a 32-bit driver when the 64-bit version is already installed.
Stay tuned for future insights and tips to avoid similar challenges!
The article Mastering SSIS with Excel – Resolving the ‘Microsoft.ACE.OLEDB.16.0 Provider is Not Registered’ Error appeared first on dbi Blog.
2024.pgconf.eu – Bigger than ever, the family is growing fast
The PostgreSQL community is growing every year, and the same happens for pgconf.eu, which took place in Athens, Greece, this year. Attending this event is one of the best opportunities to get in touch with the people behind PostgreSQL. This year we had 778 attendees, be it PostgreSQL hackers, users, developers, organizers, or anyone else interested in PostgreSQL in one way or another. While I am not going to give you any information about the sessions I’ve attended (you can check that for yourself on the conference website), I’d like to tell you why attending PostgreSQL conferences is such a great experience.
First of all, this conference is about people. Here I meet people whom I haven’t seen since last year, or even longer. But as soon as we meet again, we’re one big family sharing an interest in free and open source software, especially PostgreSQL. What makes PostgreSQL conferences such a great place to be is that there is no marketing. Of course there are sponsors, and you may get in touch with them to check what services they offer, but nobody will come to you directly to sell you something. Of course there are competitors, but you don’t feel that, because we’re one big group of people having the same goal: making PostgreSQL the best open source database in the world, making PostgreSQL even more successful, and we all can be part of that.
Especially pgconf.eu is also a pool of local PostgreSQL user groups coming together. Usually, and this year was no exception, we have a SwissPUG dinner on the Thursday of the conference. Other local user groups do the same, and when those events are over, most of the people continue discussing at either the conference hotel’s bar or somewhere else in the city. Given the size of the conference, there are people from all over the world, a lot of different cultures, opinions, and ways of living. This mix is the second reason why this is the place to be in October.
Finally, it is about the content. There are sessions which deep dive into PostgreSQL internals, there are sessions around the PostgreSQL community which are not technical at all, and there are sessions for beginners and for long-term PostgreSQL users. There is something for everybody. But this is not all, as there are also plenty of technical discussions around the talks. You want to know something about the JDBC driver? Go and talk to the people who are working on the driver. You want to know something about PostgreSQL partitioning? Go and talk to the people who either did it or are very experienced with it. There are answers to every question around PostgreSQL; you just need to either know who to talk to, or ask someone who can connect you with the right people.
Last but not least, the community is always looking for improvements:
See you all next year in? … and, by the way:
The article 2024.pgconf.eu – Bigger than ever, the family is growing fast appeared first on dbi Blog.
PostgreSQL Conference Europe 2024 – Day 2
Here is day 2 of the PostgreSQL Conference Europe 2024 in Athens; if you haven’t read about the first day, here is the link.
After a fantastic first day, I was excited to dive into the next day of the conference, which offered a mix of technical sessions that covered essential topics like upgrading PostgreSQL, handling statistics, and even taking DBA skills to the cloud.
Here are some of the highlights from the sessions I attended:
An Ultimate Guide to Upgrading Your PostgreSQL Installation
The day began with a thorough guide on PostgreSQL upgrades, focusing on best practices and critical steps to ensure a smooth upgrade process. One key takeaway was the need for logical replication during upgrades, emphasizing the requirement for primary keys on each table. The session underscored the importance of careful schema design, especially when aiming for minimal downtime during major version upgrades. This talk was very close to the subject I proposed and that I will blog about soon (Architecture for Seamless PostgreSQL Upgrades). The way upgrade best practices were presented by Ilya Kosmodemiansky was very straightforward, whereas I had taken the more complicated approach of explaining how logical replication is difficult and why you shouldn’t put yourself in the spot of having to rely on it. Ilya’s presentation helped me a lot as a speaker.
A Deep Dive into Postgres Statistics
This session took us into the inner workings of PostgreSQL’s statistics collection. Understanding how PostgreSQL gathers and utilizes statistics is fundamental for optimizing query performance, and this talk delivered just that. I really enjoyed the talk’s host, Louise Grandjonc; she brought an infectious enthusiasm that resonated with the audience (the database she chose was about medical studies). Everyone could feel that she is passionate about PostgreSQL, and her view on statistics also offered some nice insights into the reasons for certain behaviors, by looking at the source code and doing the math yourself.
Porting On-Prem Performance Troubleshooting Skills to the Cloud
I was very interested in the subject of this talk from Denzil Ribeiro (Microsoft); I didn’t expect any big announcement other than some additional Postgres features being allowed into the Azure cloud. My mistake! This was huge for PostgreSQL Azure users and DBAs seeking performance tuning tools in the cloud, because it announced the first iteration of a Query Store for PostgreSQL! For those who know and love this feature on SQL Server, this is huge, and having Microsoft port such a feature into the PostgreSQL world is a big step for them, even though it is Azure cloud-based and won’t be available in the PostgreSQL open source project… To me, this is actually quite fair on their part, but it also means future improvements to auto_explain capabilities from Microsoft’s side to support this feature, which is going to be full of possibilities for performance tuners and other future open-source projects. For me, this will also be good for other on-premise products like PGanalyze. Denzil also announced future automation in a query tuning advisor to complement this, using LLM-powered functionalities.
Demystifying Kubernetes for Postgres DBAs: A Guide to Operators
The final session of the day brought clarity to the complexities of managing PostgreSQL on Kubernetes. The speaker presented a guide to PostgreSQL operators in Kubernetes, explaining how operators can simplify database management in containerized environments. For those of us comfortable with on-prem or virtualized setups, it was a great introduction to how Kubernetes can support PostgreSQL with scalability and reliability. This talk helped demystify the Kubernetes landscape for DBAs, showing how operators can bridge the gap between traditional database administration and cloud-native practices.
Day two of the conference was packed with practical knowledge, from upgrading strategies to adapting cloud skills and navigating Kubernetes for PostgreSQL. Alongside the talks, discussing with speakers and committers of the project, I found myself already part of this community, which was and is very motivating!
Like any PostgreSQL conference, it was the occasion for me to get some new stickers and the goodies provided by the sponsors. Usually, I am not so much a fan of these things, but I couldn’t get past the conference t-shirt offered by pgconf.eu and the Microsoft Citus socks!
The article PostgreSQL Conference Europe 2024 – Day 2 appeared first on dbi Blog.
PostgreSQL Conference Europe 2024 – Day 1
Attending PostgreSQL Conference Europe 2024 in Athens was an incredible experience, full of enriching conversations and technical deep dives. As a DBA Consultant and speaker (on the reserve list, and honored to have been chosen), it was my first time meeting this amazing community in person, and I was immediately struck by the camaraderie and shared passion for PostgreSQL.
The atmosphere was welcoming, with everyone — from those just starting their PostgreSQL journey to seasoned veterans — eager to exchange knowledge. I even had the pleasure of meeting people whose blogs and posts I’ve followed for years, like Laurenz Albe. These interactions made me realize how vibrant and collaborative this community is.
Choosing which talks to attend was quite tricky for me, as I wanted to attend all of them.
You might want to check the schedule yourself to get an idea:
Schedule — PostgreSQL Conference Europe 2024
Opening & Keynote: PostgreSQL and Licensing
The conference kicked off with a keynote that touched on the licensing structure of PostgreSQL. It was an insightful reminder of how PostgreSQL remains free and open-source, empowering countless organizations worldwide to run enterprise-grade databases without the burden of licensing fees, and it expanded on the fact that the real fee is the time and effort of the contributors and the community.
Cybertec: High Concurrency & Distributed Snapshots
This session dived into Cybertec’s research on improving high-concurrency workloads by implementing distributed snapshots. The idea of using a Commit Sequence Number system, similar to what Yugabyte uses with hybrid time, was particularly intriguing. I have to admit that I didn’t get most of the talk, as this was part of the internal track, but it was very interesting to see the efforts being made by committers and the ongoing improvements in making read and write I/O capabilities even better.
Laurenz Albe: Security Attacks on PostgreSQL
Laurenz’s talk on security attacks was one of my most anticipated sessions. He covered practical techniques to defend against common security vulnerabilities in PostgreSQL. His advice on setting security_barrier = ON
for views, properly configuring security definer functions, and the potential abuse of search paths was invaluable. It reinforced how critical it is to consider these security measures when architecting secure PostgreSQL solutions. I myself have had a blog post on this pending for some weeks now, but the talk motivated me to publish it with some new ideas.
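As a small illustration of the first point, this is roughly what a security-barrier view looks like in SQL (a sketch with hypothetical table and view names, not taken from the talk):
-- The security_barrier option prevents functions in queries against the view
-- from being pushed down below the view's own WHERE clause.
CREATE VIEW customer_public WITH (security_barrier = true) AS
    SELECT id, name
    FROM customer
    WHERE NOT is_confidential;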
Michael Bank: Patroni Deployment Patterns
As someone involved in high-availability setups, the session on Patroni deployment patterns by Michael Bank was a must-see. Michael explored various ways to implement Patroni in different environments, focusing on best practices and real-world examples. It provided a deeper understanding of how to fine-tune Patroni architectures in production environments, a topic that resonates with my ongoing work.
Social event
In the evening there was a social event where drinks and food were served; it was the occasion to meet even more people and speak all the languages you can…
It was the occasion for some to dress up:
The first day of the conference offered not only technical depth but also meaningful exchanges with fellow DBAs. The talks, combined with discussions, provided a perfect blend of learning and sharing.
Looking forward to what the rest of the conference holds!
The article PostgreSQL Conference Europe 2024 – Day 1 appeared first on dbi Blog.
M-Files Regional Partner 2024 (Geneva)
Last week, I had the pleasure of being invited to the M-Files Regional Partner event in Geneva.
M-Files people rallied around us:
Tero Antila – Director EMEA Partner Sales
Helinã Mellanen – Senior Enablement Manager
Emilie Feat – Senior Partner Manager Switzerland
Sebastian Wegener – Senior Partner Manager Germany & Austria
Joachim Bleicher – Senior Partner Sales Engineer
Mustapha Laouari – Sales Engineer
After a great morning coffee, a few M-Files business overviews (goals and future strategy plans), and some networking, I decided to follow the different technical tracks presented by Joachim and Mustapha.
New M-Files Desktop
If you are quite familiar with M-Files, you have most probably used the current M-Files Desktop client:
The objective of the new M-Files Desktop is to streamline the different client interfaces you use to access your vault documents, such as the classic and new M-Files Web clients or M-Files Mobile. Currently, those client UIs may lead to some confusion for end users and can be frustrating in terms of functionality and/or interactivity limitations. The new client is built on modern web technologies and has a fresh look and feel. A single code base is used for the new M-Files Desktop and the M-Files Web client.
Already available in October 2024, it should speed up access to new capabilities with its updated design, such as the Aino tab (an assistant in M-Files that helps users find information), an improved filter view, and easier vault navigation. The “M:” drive is no longer used. Even if the classic desktop client remains the default for now, the transition should minimize disruption, and users will still have the possibility to toggle to and from the new client version.
New capabilities will come in the near future:
– Create view
– Home screen
– Drag & drop to Windows
– Visitor links
… and much more
A last technical word: in order to make this operational, you will have to enable the gRPC connection protocol on your M-Files server (something you do not have to take care of if you use the M-Files cloud solution, where it is ready for use). Also, consider the points below while operating the transition:
- UI extension applications require the new application framework, UIX Framework 2.0
- Compliance Kit compatibility: a new CK version will be released which will be compatible with the new client (including object creator and version control modules (most used modules))
Here is the current Outlook context/situation:
Today, you can save all your important emails to one safe M-Files space: apply metadata (only relevant metadata needs to be filled in), save attachments, and even automate filling with customizable Outlook folder rules through the Outlook interface. Now it seems that more and more features will come:
Cloud update
No doubt M-Files Cloud is taking up more and more room: saving time to value, with less effort, while keeping security, compliance, and high availability.
An M-Files trial is now available for partners (partner program fee). It includes cloud hosting on Standard Cloud as well as M-Files Aino with a 1GB semantic index. Check for additional services as needed:
M-Files Manage and Identity Management
A quick review of the concept of identity management in M-Files:
- User provisioning
- Authentication (Windows, M-Files and Federated authentication)
- M-Files Manage
As you may know, vault users are always mapped to “login accounts”.
In the cloud, M-Files Manage is used to add users manually, with Microsoft Entra ID preferred together with the M-Files Login Service (an out-of-the-box, easy option).
The latest M-Files Manage updates allow copying existing template vaults directly in M-Files Cloud across two different subscriptions. An application account can be used to get access to M-Files.
Moreover, cloud vault deletion and mass operations (mass license/user changes/operations) are coming soon.
M-Files Aino & Copilot connector
M-Files Aino (a cloud service requiring a connection to M-Files Cloud) is an AI assistant that can summarize documents and give answers about document contents. Connectivity, confidentiality, and accuracy are the elements of this successful generative AI solution.
Ask something and get answers in a particular context (vault or properties metadata-driven context)
… even with documents in other languages.
Aino can be used with any text-based document. Behind the scenes, an additional module, “Aino Usage: Semantic Index”, needs to be installed. Also consider carefully the Aino indexer configuration settings and the index storage size, which is directly proportional to the amount of text in (only active) documents.
The M365 Copilot integration (M-Files connector for Copilot, included in the Business Platform Edition but not in Business LE) ensures that content in M-Files is available to M365 Copilot users. It can be used independently of Aino.
With this Connector for Copilot, users can access all the content in M-Files with the same Copilot user interface. It publishes selected information from M-Files to the Microsoft 365 platform so that Copilot can refer to it when doing its work (the information retains its permissions from M-Files, i.e., the same M-Files read permission access). Clicking a referenced document takes the user right to the document in M-Files.
Update on Collaboration possibilities
- Teams
As a reminder, external users can access the Teams channel, but only internal users can see what is in M-Files through Teams (concurrent user license). Here’s the roadmap:
Some nice features and improvements are to come, such as storing and accessing Teams channel posts’ attachments.
- Hubshare
M-Files Hubshare is a full web application developed in C# and updated monthly. Additionally, a new version of the M-Files Plugin for Hubshare is published every 3 months. Below is a summary of the 2024 highlights.
Some objectives and improvements coming for Hubshare:
Conclusion
As you can see, there is a lot going on, and I hope you found out, or (re-)discovered, how rich and lively the M-Files ecosystem is. If you plan to use, improve, or discover these tools in your current and future business process integrations, please do not hesitate to contact us for any support.
The article M-Files Regional Partner 2024 (Geneva) appeared first on dbi Blog.
PostgreSQL 17: Enhancing JSON Support for Web Developers
PostgreSQL 17 introduces new features for working with JSON data. These features align PostgreSQL more closely with the SQL/JSON standard and improve the developer experience when dealing with semi-structured data. In this blog, we will explore the new JSON capabilities by walking through real-world examples using a table with a jsonb
column.
When building modern web applications, it’s common to handle JSON data—whether it’s from API responses, user data, or configuration files. PostgreSQL has supported JSON for years, and with the release of PostgreSQL 17, working with JSON becomes even more streamlined.
Let’s start by creating a table to store user data in a jsonb
column and use it to demonstrate the new features.
Creating a Table with a jsonb Column
To demonstrate PostgreSQL 17’s new features, let’s create a simple table called users
that stores user information in a jsonb
column.
postgres=# CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
profile JSONB
);
CREATE TABLE
This table has three columns:
- id: A unique identifier for each user.
- name: The user’s name.
- profile: A jsonb column that stores various information about the user, such as their age, preferences, and settings.
Now, we’ll insert some user data in JSON format into the profile
column.
postgres=# INSERT INTO users (name, profile)
VALUES
('John Doe', '{"age": 30, "preferences": {"newsletter": true, "theme": "dark"}}'),
('Jane Smith', '{"age": 25, "preferences": {"newsletter": false, "theme": "light"}}'),
('Alice Brown', '{"age": 35, "preferences": {"newsletter": true, "theme": "dark"}, "address": {"city": "Paris", "country": "France"}}');
INSERT 0 3
We now have three users with different JSON profiles, each containing details such as age, preferences for newsletters and themes, and, in Alice’s case, an address.
postgres=# select * from users;
id | name | profile
----+-------------+----------------------------------------------------------------------------------------------------------------------
1 | John Doe | {"age": 30, "preferences": {"theme": "dark", "newsletter": true}}
2 | Jane Smith | {"age": 25, "preferences": {"theme": "light", "newsletter": false}}
3 | Alice Brown | {"age": 35, "address": {"city": "Paris", "country": "France"}, "preferences": {"theme": "dark", "newsletter": true}}
(3 rows)
With the table set up and populated, let’s explore the new JSON-related features in PostgreSQL 17, such as JSON_TABLE
, SQL/JSON query functions, and enhanced jsonpath
expressions.
JSON_TABLE: Converting JSON into Tabular Format
The JSON_TABLE
function allows us to transform our jsonb
data into rows and columns. This is particularly useful when we want to extract structured data from JSON documents stored in a relational database.
Let’s extract the age
and theme
from the profile
column and convert it into a tabular format:
postgres=# SELECT *
FROM JSON_TABLE(
(SELECT profile FROM users WHERE name = 'Alice Brown'),
'$.preferences' COLUMNS (theme TEXT PATH '$.theme', newsletter BOOLEAN PATH '$.newsletter')
) AS jt;
theme | newsletter
-------+------------
dark | t
(1 row)
Here, we extracted Alice Brown’s theme preference and whether she subscribes to the newsletter from her profile. The $ symbol is essential for this operation.
Here’s a breakdown of how this works:
- $.preferences: This part refers to the preferences key at the root of the JSON document. Once this key is selected, it acts as the root for the columns inside the preferences object.
- Inside the COLUMNS clause:
  - $.theme: Here, $ refers to the root of the preferences object, not the root of the entire JSON document. So, it looks for the theme key inside the preferences object.
  - $.newsletter: Similarly, $ here refers to the root of the preferences object, and it looks for the newsletter key within that object.
In this context, the $ in the COLUMNS clause is “relative” to the object you are working with, which in this case is preferences. This is a key concept in using the JSON_TABLE function and jsonpath expressions in PostgreSQL—$ adapts based on the context of the object you’re working with at that stage of the query.
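As a small variation on the example above (a sketch using the same users table), JSON_TABLE can also be applied to every row at once by referencing the jsonb column directly in the FROM clause, which turns each user’s preferences into regular columns:
-- Expand each user's preferences object into the columns theme and newsletter.
SELECT u.name, jt.theme, jt.newsletter
FROM users u,
     JSON_TABLE(
       u.profile,
       '$.preferences' COLUMNS (theme TEXT PATH '$.theme', newsletter BOOLEAN PATH '$.newsletter')
     ) AS jt;
-- Based on the rows inserted above: John Doe (dark, t), Jane Smith (light, f), Alice Brown (dark, t).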
PostgreSQL allows you to create JSON directly from SQL expressions, making it easier to work with JSON data dynamically. This function exists since PostgreSQL 12, but I believe that it makes sense to present it here.
Let’s construct some JSON data based on our users
table:
postgres=# SELECT name,
jsonb_build_object(
'age', profile->>'age',
'theme', profile->'preferences'->>'theme'
) AS constructed_json
FROM users;
name | constructed_json
-------------+---------------------------------
John Doe | {"age": "30", "theme": "dark"}
Jane Smith | {"age": "25", "theme": "light"}
Alice Brown | {"age": "35", "theme": "dark"}
(3 rows)
This query dynamically builds a JSON object from the age
and theme
fields in the profile
column.
PostgreSQL 17 introduces several new SQL/JSON query functions, such as JSON_EXISTS
, JSON_QUERY
, and JSON_VALUE
. These functions allow you to query and extract values from JSON documents more efficiently.
Let’s check if the address
key exists in John Doe’s profile:
postgres=# SELECT JSON_EXISTS(profile, '$.address')
FROM users
WHERE name = 'John Doe';
json_exists
-------------
f
(1 row)
We can use the JSON_VALUE
function to extract specific values from a JSON document. For example, let’s extract the city
from Alice’s address:
postgres=# SELECT JSON_VALUE(profile, '$.address.city')
FROM users
WHERE name = 'Alice Brown';
json_value
------------
Paris
(1 row)
The JSON_VALUE
function simplifies the process of extracting individual scalar values from JSON documents. Use JSON_VALUE()
only when you expect the extracted value to be a single SQL/JSON scalar. Attempting to retrieve multiple values will result in an error. If the extracted value might be an object or an array, opt for the JSON_QUERY
function instead.
The JSON_QUERY
function in PostgreSQL provides a powerful way to extract data from JSONB columns using path expressions. By leveraging its various options, developers can customize the output format, handle errors gracefully, and work efficiently with JSON data stored in PostgreSQL. Now, let’s use JSON_QUERY
to extract the preferences for each user. We want to get the preferences
object for each user’s profile.
postgres=# SELECT
name,
JSON_QUERY(profile, '$.preferences') AS user_preferences
FROM users;
name | user_preferences
-------------+-----------------------------------------
John Doe | {"theme": "dark", "newsletter": true}
Jane Smith | {"theme": "light", "newsletter": false}
Alice Brown | {"theme": "dark", "newsletter": true}
(3 rows)
Enhanced jsonpath Expressions
PostgreSQL improves its support for jsonpath
expressions, enabling more advanced queries. You can now cast JSON values into native PostgreSQL types, such as integers or booleans.
Let’s extract Jane Smith’s age
and cast it as an integer:
postgres=# SELECT JSON_VALUE(profile, '$.age' RETURNING INT) AS age_int
FROM users
WHERE name = 'Jane Smith';
age_int
---------
25
(1 row)
This query demonstrates how you can convert JSON data into a native PostgreSQL type using the RETURNING
clause.
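A related refinement, shown here as a sketch rather than taken from the original examples: the RETURNING clause can be combined with an ON EMPTY handler to control what is returned when the path matches nothing, as for John Doe, whose profile has no address:
-- Without DEFAULT ... ON EMPTY, this would simply return NULL for John Doe.
SELECT JSON_VALUE(profile, '$.address.city' RETURNING TEXT DEFAULT 'unknown' ON EMPTY) AS city
FROM users
WHERE name = 'John Doe';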
PostgreSQL 17 brings powerful new features for working with JSON data, making it easier for web developers to query and manipulate JSON in a database. We explored how JSON_TABLE
, new SQL/JSON functions, and enhanced jsonpath
expressions help convert JSON data into a more usable format, extract values, and even handle complex queries.
These new tools make working with semi-structured data simpler, allowing you to build more efficient and flexible web applications. If you often work with JSON in PostgreSQL, upgrading to version 17 will streamline your workflow and enhance your capabilities.
Stay tuned, and don’t forget to check the PostgreSQL 17 release notes: https://www.postgresql.org/about/news/postgresql-17-released-2936/
The article PostgreSQL 17: Enhancing JSON Support for Web Developers appeared first on dbi Blog.
Improved not null/null planning in PostgreSQL 17
PostgreSQL 17 came with quite some additional features for the optimizer. One of them is about “is null” and “is not null” handling when a column has a “not null” constraint defined. In the past, it was common to simply not add an “is not null” to a query when you were sure anyway that there cannot be any nulls because of the “not null” constraint. If you did it anyway, there was some overhead because the “is not null” had to be evaluated. This changed with PostgreSQL 17, as the optimizer got smarter for such cases.
This can easily be demonstrated by creating a simple table with one column having a not null constraint:
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)
postgres=# create table t ( a int not null );
CREATE TABLE
postgres=# insert into t select * from generate_series(1,1000000);
INSERT 0 1000000
postgres=# analyze t;
ANALYZE
… and the same in version 17:
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 17.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)
postgres=# create table t ( a int not null );
CREATE TABLE
postgres=# insert into t select * from generate_series(1,1000000);
INSERT 0 1000000
postgres=# analyze t;
ANALYZE
If we select from that table in version 16 by asking for all the rows which are not null, we’ll see this:
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)
postgres=# explain select * from t where a is not null;
QUERY PLAN
-----------------------------------------------------------
Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4)
Filter: (a IS NOT NULL)
(2 rows)
The filter (a IS NOT NULL) is evaluated even though we know it applies to all the rows because of the not null constraint. Doing the same in version 17 looks like this:
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 17.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)
postgres=# explain select * from t where a is not null;
QUERY PLAN
-----------------------------------------------------------
Seq Scan on t (cost=0.00..14425.00 rows=1000000 width=4)
(1 row)
The filter is gone, which saves the overhead of evaluating it. The other case is asking for all rows which are null, which obviously cannot return anything because the constraint does not allow nulls.
PostgreSQL 16 will do a parallel sequential scan over the whole table:
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)
postgres=# explain select * from t where a is null;
QUERY PLAN
-------------------------------------------------------------------
Gather (cost=1000.00..9591.77 rows=1 width=4)
Workers Planned: 2
-> Parallel Seq Scan on t (cost=0.00..8591.67 rows=1 width=4)
Filter: (a IS NULL)
(4 rows)
PostgreSQL 17 will not do that anymore:
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 17.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
(1 row)
postgres=# explain select * from t where a is null;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
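To confirm that this really hinges on the constraint, here is a quick counter-check (a sketch against the same table t, not part of the original demonstration): once the constraint is dropped, the planner can no longer prove anything, and the plan should again contain the “Filter: (a IS NOT NULL)” line, as in version 16:
postgres=# alter table t alter column a drop not null;
ALTER TABLE
postgres=# explain select * from t where a is not null;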
This seems obvious, but versions before 17 cannot do it. A really nice improvement, and this alone should make you move to PostgreSQL 17 as soon as possible.
The article Improved not null/null planning in PostgreSQL 17 appeared first on dbi Blog.
Creating a Global and Custom Snackbar in Vue 3 Using Vuetify and Pinia
In this post, we’ll walk through how to implement a Global and Custom Snackbar in Vue 3 with Vuetify and Pinia for state management. We’ll set it up so that you can easily trigger the Snackbar from any component in your app.
I. Set Up Your Vue App with Vuetify CLI
To start, we’ll create our Vue 3 app using the Vuetify CLI. This CLI simplifies the process of integrating Vuetify and sets up your project structure.
npm create vuetify@latest
Follow the prompts to choose your project settings and install dependencies. As we are going to use Pinia, don’t forget to choose this option. I decided to keep all the generated files to keep it simple and straightforward, but you can clean up the code that was automatically generated. I just cleaned the component HelloWorld.vue, which now looks like this:
<template>
<v-container class="fill-height">
<v-responsive class="align-center fill-height mx-auto" max-width="900">
</v-responsive>
</v-container>
</template>
<script setup lang="ts">
</script>
Next, create a Pinia store (SnackbarStore.ts
) to handle the state of the Snackbar (open/close, message, status, etc.). This will allow you to trigger the Snackbar from anywhere in the app.
// SnackbarStore.ts inside stores directory
import { defineStore } from 'pinia'
import { Ref, ref } from 'vue'
export interface CustomAction {
actionName: string;
link: any;
}
export const useSnackbarStore = defineStore('SnackbarStore', () => {
const isOpen = ref(false)
const message = ref('')
const status = ref('error')
const customActions: Ref<CustomAction[]> = ref([])
const showSnackbar = (msg: string, st: string, optionalActions?: CustomAction[]) => { // Change to accept an array
message.value = msg
status.value = st
isOpen.value = true
if (optionalActions) {
customActions.value.push(...optionalActions) // Spread the array into customActions
}
if (status.value.toLowerCase() === 'success') {
setTimeout(() => {
closeSnackbar()
}, 2500)
}
}
function closeSnackbar () {
isOpen.value = false
customActions.value.splice(0, customActions.value.length) // Clear custom actions
}
return { isOpen, message, status, showSnackbar, closeSnackbar, customActions }
})
We’ll create a reusable Snackbar component (GlobalSnackbar.vue
) that can be used anywhere in your application.
<template>
<div class="text-center">
<v-snackbar
v-model="snackbarRef"
class="customSnackbar"
:color="snackbarStatus"
variant="outlined"
vertical
>
<div class="text-subtitle-1 pb-2 title-custom">{{ snackbarStatus }}</div>
<p>{{ errorMsg }}</p>
<template #actions>
<v-btn
v-for="act of snackbarActions"
:key="act.actionName"
:to="act.link"
>
{{ act.actionName }}
</v-btn>
<v-btn variant="tonal" @click="closeSnackbar"> Close </v-btn>
</template>
</v-snackbar>
</div>
</template>
<script lang="ts" setup>
import { toRef } from 'vue'
import { CustomAction, useSnackbarStore } from '@/stores/SnackbarStore'
// Define the properties that the component will receive
const props = defineProps({
snackbarShow: Boolean,
errorMsg: String,
snackbarStatus: String,
snackbarActions: {
type: Array as () => CustomAction[], // Custom type to define snackbar actions
},
})
const snackbar = useSnackbarStore()
const snackbarRef = toRef(props, 'snackbarShow')
const closeSnackbar = () => {
snackbar.closeSnackbar()
}
</script>
<style>
.customSnackbar.v-snackbar--vertical .v-snackbar__wrapper {
background-color: rgb(238, 238, 238);
}
.title-custom {
text-transform: uppercase !important;
}
</style>
Update App.vue to Include the Global Snackbar
To ensure the Snackbar is available across your entire app, import it into App.vue
and pass the required props from the SnackbarStore
.
<template>
<v-app>
<v-main>
<router-view />
</v-main>
<GlobalSnackbar
:error-msg="snackbar.message"
:snackbar-actions="snackbar.customActions"
:snackbar-show="snackbar.isOpen"
:snackbar-status="snackbar.status"
/>
</v-app>
</template>
<script lang="ts" setup>
import { useSnackbarStore } from '@/stores/SnackbarStore'
const snackbar = useSnackbarStore()
</script>
This step is essential to making sure the GlobalSnackbar
component is part of the root layout and is always available to display messages.
Now, let’s demonstrate how to trigger the Snackbar from any component. For this example, we’ll use a button that, when clicked, shows a success message with an optional action. We can edit HelloWorld.vue with the following code.
<template>
<v-container class="fill-height">
<v-responsive class="align-center fill-height mx-auto" max-width="900">
<v-btn @click="openSnackbar()">test</v-btn>
</v-responsive>
</v-container>
</template>
<script setup lang="ts">
import { useSnackbarStore } from '@/stores/SnackbarStore'
const snackbar = useSnackbarStore()
const openSnackbar = () => {
snackbar.showSnackbar('Show snackbar', 'success', [
{ actionName: 'Go to Home', link: '/home' },
{ actionName: 'Go to Test', link: '/test' },
])
}
</script>
Once everything is in place, try clicking the button to trigger the Snackbar. You should see the Snackbar with a success message and optional action buttons appear at the bottom of the screen. And logically, if you click on one of the custom action buttons, it will redirect you properly, but you will obviously see a blank page, as the view doesn’t exist.
Conclusion
In this post, we created a global and custom Snackbar component in Vue 3 using Vuetify and Pinia. We set it up so that it can be easily triggered from anywhere in the app. By managing its state with Pinia, we keep the logic centralized and clean, making the Snackbar reusable across the entire application.
Feel free to adjust the styling and functionality of the Snackbar to suit your needs. Now you have a flexible, easy-to-use notification system for your Vue 3 project!
The article Creating a Global and Custom Snackbar in Vue 3 Using Vuetify and Pinia appeared first on dbi Blog.
Documentum – xPlore Dsearch not starting with a stuck rebuild
I recently faced another interesting issue (c.f. this one) with Documentum xPlore where the Dsearch would refuse to start because of a stuck online rebuild. It happened in a Kubernetes environment using custom images we built (for security and scalability). The K8s Nodes were split across two DataCenters for DR reasons. However, one of the DataCenters had a major issue which caused all VMs running there to freeze and become unresponsive. Some Documentum pods were “running” on these VMs at that point in time. When these K8s Nodes were put offline, the pods were properly rescheduled on the other DataCenter. However, after that operation, one specific Dsearch pod couldn’t start anymore. This Documentum environment has two xPlore Federations, each containing one Dsearch (with embedded local CPS), one IndexAgent and two CPS-Only pods (so four xPlore pods per Federation). All pods were up & running properly, except for that particular Dsearch.
The startup logs of the Dsearch were as follows:
[xplore@ds1-0 ~]$ cdlogPrimaryDsearch
[xplore@ds1-0 logs]$ ls -ltr
total 108
-rw-r----- 1 xplore xplore 0 Oct 14 07:20 rest.log
-rw-r----- 1 xplore xplore 139 Oct 14 07:20 dsearchadminweb.log
-rw-r----- 1 xplore xplore 37722 Oct 14 07:21 cps_daemon.log
-rw-r----- 1 xplore xplore 1353 Oct 14 07:21 cps.log
-rw-r----- 1 xplore xplore 3218 Oct 14 07:21 xdb.log
-rw-r----- 1 xplore xplore 7445 Oct 14 07:21 dfc.log
-rw-r----- 1 xplore xplore 39287 Oct 14 07:21 dsearch.log
[xplore@ds1-0 logs]$
[xplore@ds1-0 logs]$ grep ERROR *
dfc.log:2024-10-14 07:21:18,279 ERROR [default task-1] io.undertow.servlet.request - UT015002: Stopping servlet IndexServerServlet due to permanent unavailability
dsearch.log:2024-10-14 07:21:09,858 ERROR [ServerService Thread Pool -- 76] c.e.d.c.f.i.core.collection.ESSCollection - Failed to create index on library REPO_NAME/dsearch/Data/col1.
dsearch.log:2024-10-14 07:21:09,860 ERROR [ServerService Thread Pool -- 76] c.e.d.core.fulltext.webapp.IndexServerServlet - Failed to start
dsearch.log:2024-10-14 07:21:18,278 ERROR [default task-1] c.e.d.core.fulltext.webapp.IndexServerServlet - Failed to start
dsearch.log:2024-10-14 07:21:45,963 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
dsearch.log:2024-10-14 07:21:45,966 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
dsearch.log:2024-10-14 07:21:45,968 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
dsearch.log:2024-10-14 07:21:45,968 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
dsearch.log:2024-10-14 07:21:45,970 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
[xplore@ds1-0 logs]$
[xplore@ds1-0 logs]$ cat dsearch.log
2024-10-14 07:20:59,035 INFO [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.core.ESSNode - Starting xPlore
2024-10-14 07:20:59,044 INFO [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.core.ESSContext - Initializing xPlore instance
2024-10-14 07:20:59,062 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Initializing xDB federation and database
2024-10-14 07:21:00,265 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.i.engine.xhive.impl.XhiveSchemaHandler - Not loading configuration from file system, revision 1.29 equal to xDB revision of indexserverconfig.xml
2024-10-14 07:21:00,725 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.indexserver.core.ESSActiveNodesRegistry - Register instance PrimaryDsearch
2024-10-14 07:21:00,726 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Starting xDB for instance primary
2024-10-14 07:21:00,967 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Initialize xDB driver to local driver with cache pages: 524288 and Socket timeout :1000 ms
2024-10-14 07:21:01,153 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - XDB is listening at 9330 successfully
2024-10-14 07:21:01,253 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Initialize xDB driver to remote driver with cache pages: 524288 and Socket timeout :1000 ms
2024-10-14 07:21:01,253 INFO [ServerService Thread Pool -- 76] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - The XML database server is started successfully. {xDB version=xDB 10_7@4558357}
2024-10-14 07:21:01,261 DEBUG [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Begin to connect to CPS at (local) with connection 3
2024-10-14 07:21:08,559 INFO [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Connected to a local Content Processing Service with version [20.2.0000.0015].
2024-10-14 07:21:08,630 DEBUG [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Begin to connect to CPS at (https://cps1-0.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl) with connection 3
2024-10-14 07:21:09,077 INFO [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Connected to a remote Content Processing Service [https://cps1-0.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl] with version [20.2.0000.0015].
2024-10-14 07:21:09,244 DEBUG [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Begin to connect to CPS at (https://cps1-1.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl) with connection 3
2024-10-14 07:21:09,437 INFO [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Connected to a remote Content Processing Service [https://cps1-1.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl] with version [20.2.0000.0015].
2024-10-14 07:21:09,626 INFO [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.core.ESSContext - Initialize domain SystemData
2024-10-14 07:21:09,706 INFO [ServerService Thread Pool -- 76] c.e.d.core.fulltext.indexserver.core.ESSContext - Initialize domain REPO_NAME
2024-10-14 07:21:09,858 ERROR [ServerService Thread Pool -- 76] c.e.d.c.f.i.core.collection.ESSCollection - Failed to create index on library REPO_NAME/dsearch/Data/col1.
java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ReindexStatus.reset(ReindexStatus.java:68)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.addLMPIReindexTaskIfNecessary(ESSCollection.java:2766)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.createIndexes(ESSCollection.java:2824)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.initIndex(ESSCollection.java:2665)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.init(ESSCollection.java:2347)
...
2024-10-14 07:21:09,860 ERROR [ServerService Thread Pool -- 76] c.e.d.core.fulltext.webapp.IndexServerServlet - Failed to start
com.emc.documentum.core.fulltext.common.exception.IndexServerRuntimeException: com.emc.documentum.core.fulltext.common.exception.IndexServerException: Failed to create index on library REPO_NAME/dsearch/Data/col1.
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSContext.initialize(ESSContext.java:261)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSNode.startUp(ESSNode.java:67)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.webapp.IndexServerServlet.init(IndexServerServlet.java:48)
at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.LifecyleInterceptorInvocation.proceed(LifecyleInterceptorInvocation.java:117)
...
Caused by: com.emc.documentum.core.fulltext.common.exception.IndexServerException: Failed to create index on library REPO_NAME/dsearch/Data/col1.
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.createIndexes(ESSCollection.java:2864)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.initIndex(ESSCollection.java:2665)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.init(ESSCollection.java:2347)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.<init>(ESSCollection.java:145)
...
Caused by: java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ReindexStatus.reset(ReindexStatus.java:68)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.addLMPIReindexTaskIfNecessary(ESSCollection.java:2766)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.createIndexes(ESSCollection.java:2824)
... 34 common frames omitted
2024-10-14 07:21:09,861 INFO [Index-Rebuilder-col1-0] c.e.d.c.f.i.core.collection.ESSCollection - Rebuilding index [dmftdoc] for collection [col1].
2024-10-14 07:21:09,863 INFO [Index-Rebuilder-col1-0] c.e.d.c.f.i.core.collection.ESSCollection - Rebuild index dmftdoc with non-blocking mode.
2024-10-14 07:21:18,256 INFO [default task-1] c.e.d.core.fulltext.indexserver.core.ESSNode - Starting xPlore
2024-10-14 07:21:18,256 INFO [default task-1] c.e.d.core.fulltext.indexserver.core.ESSContext - Initializing xPlore instance
2024-10-14 07:21:18,276 INFO [default task-1] c.e.d.c.f.indexserver.core.ESSActiveNodesRegistry - Register instance PrimaryDsearch
2024-10-14 07:21:18,276 INFO [default task-1] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Starting xDB for instance primary
2024-10-14 07:21:18,278 ERROR [default task-1] c.e.d.core.fulltext.webapp.IndexServerServlet - Failed to start
com.emc.documentum.core.fulltext.common.exception.IndexServerRuntimeException: com.emc.documentum.core.fulltext.common.exception.EngineException: Failed to start xDB socket listener
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSContext.initialize(ESSContext.java:261)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSNode.startUp(ESSNode.java:67)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.webapp.IndexServerServlet.init(IndexServerServlet.java:48)
at io.undertow.servlet@2.0.21.Final//io.undertow.servlet.core.LifecyleInterceptorInvocation.proceed(LifecyleInterceptorInvocation.java:117)
...
Caused by: com.emc.documentum.core.fulltext.common.exception.EngineException: Failed to start xDB socket listener
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.engine.xhive.impl.XhiveManager.startDatabase(XhiveManager.java:646)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSContext.startDatabase(ESSContext.java:187)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.ESSContext.initialize(ESSContext.java:237)
... 49 common frames omitted
Caused by: java.net.BindException: Address already in use (Bind failed)
at java.base/java.net.PlainSocketImpl.socketBind(Native Method)
...
2024-10-14 07:21:45,963 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.ESSIndexHelper.getObjectCategoryConfig(ESSIndexHelper.java:97)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.ESSIndexHelper.shouldCheckObjectVersion(ESSIndexHelper.java:115)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.FtIndexObject.<init>(FtIndexObject.java:71)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.callCPSPlugin(ESSXMLNodeHandler.java:379)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.enter(ESSXMLNodeHandler.java:123)
at deployment.dsearch.war//com.xhive.xDB_10_7_r4558357.ch$b.c(xdb:617)
...
2024-10-14 07:21:45,966 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.prepareDomDocument(ESSXMLNodeHandler.java:304)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.callCPSPlugin(ESSXMLNodeHandler.java:350)
...
2024-10-14 07:21:45,968 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.prepareDomDocument(ESSXMLNodeHandler.java:304)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.callCPSPlugin(ESSXMLNodeHandler.java:350)
...
2024-10-14 07:21:45,968 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.prepareDomDocument(ESSXMLNodeHandler.java:304)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.callCPSPlugin(ESSXMLNodeHandler.java:350)
...
2024-10-14 07:21:45,970 ERROR [Index-Rebuilder-col1-0-Worker-0] c.e.d.c.f.i.core.index.xhive.ESSXMLNodeHandler - Failed to handle 090f123480096bde
java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.ESSIndexHelper.getObjectCategoryConfig(ESSIndexHelper.java:97)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.ESSIndexHelper.shouldCheckObjectVersion(ESSIndexHelper.java:115)
...
2024-10-14 07:21:46,427 WARN [Index-Rebuilder-col1-0] c.e.d.c.f.i.core.collection.FtReindexTask - Reindex for index col1.dmftdoc failed
com.emc.documentum.core.fulltext.common.exception.IndexServerException: java.lang.NullPointerException
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.recreatePathIndexNB(ESSCollection.java:3391)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.reindexNB(ESSCollection.java:1360)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.ESSCollection.reindex(ESSCollection.java:1249)
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.collection.FtReindexTask.run(FtReindexTask.java:204)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.NullPointerException: null
at deployment.dsearch.war//com.emc.documentum.core.fulltext.indexserver.core.index.xhive.ESSXMLNodeHandler.getTokens(ESSXMLNodeHandler.java:163)
at deployment.dsearch.war//com.xhive.xDB_10_7_r4558357.ch$b.hasNext(xdb:473)
at deployment.dsearch.war//com.xhive.core.index.PathValueIndexModifier.a(xdb:328)
...
[xplore@ds1-0 logs]$
In terms of processes, it looked “OK”, as all the expected Dsearch processes were present, including the (local) CPS Indexing & Querying ones. However, the K8s liveness probe (a custom one I created long ago) was showing an issue with “/dsearch” not responding, and the pod was restarting continuously:
[xplore@ds1-0 logs]$ ps uxf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
xplore 3015817 0.0 0.0 13920 3220 pts/1 S 07:20 0:00 /bin/sh ./startPrimaryDsearch.sh
xplore 3015819 0.0 0.0 13940 3276 pts/1 S 07:20 0:00 \_ /bin/sh /app/xPlore/wildfly17.0.1/bin/standalone.sh
xplore 3015920 52.9 1.5 10265428 2030744 pts/1 Sl 07:20 1:14 \_ /app/xPlore/java64/JAVA_LINK/bin/java -D[Standalone] -server -Xms8g -Xmx8g -XX:MaxMetaspaceSize=512m -XX:+UseG1GC -XX:+UseStringDeduplicati
xplore 3016359 0.1 0.0 670524 59008 pts/1 Sl 07:21 0:00 \_ /app/xPlore/dsearch/cps/cps_daemon/bin/CPSDaemon /app/xPlore/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml Daemon0 9322
xplore 3016412 0.1 0.0 670460 59868 pts/1 Sl 07:21 0:00 \_ /app/xPlore/dsearch/cps/cps_daemon/bin/CPSDaemon /app/xPlore/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml Daemon1 9323
xplore 3016486 0.1 0.0 670652 61400 pts/1 Sl 07:21 0:00 \_ /app/xPlore/dsearch/cps/cps_daemon/bin/CPSDaemon /app/xPlore/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml Daemon2 9324
xplore 3016539 0.1 0.0 670396 58836 pts/1 Sl 07:21 0:00 \_ /app/xPlore/dsearch/cps/cps_daemon/bin/CPSDaemon /app/xPlore/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml Daemon3 9325
xplore 3016660 0.1 0.0 670588 63384 pts/1 Sl 07:21 0:00 \_ /app/xPlore/dsearch/cps/cps_daemon/bin/CPSDaemon /app/xPlore/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml Daemon4 9326
xplore 3016714 0.1 0.0 670524 61196 pts/1 Sl 07:21 0:00 \_ /app/xPlore/dsearch/cps/cps_daemon/bin/CPSDaemon /app/xPlore/dsearch/cps/cps_daemon/PrimaryDsearch_local_configuration.xml QDaemon0 932
xplore 3012770 0.0 0.0 14064 3668 pts/1 Ss 07:14 0:00 bash -l
xplore 3017548 0.0 0.0 53832 3960 pts/1 R+ 07:22 0:00 \_ ps uxf
xplore 3011206 0.0 0.0 14064 3656 pts/2 Ss+ 07:10 0:00 bash -l
xplore 2988030 0.0 0.0 14064 3720 pts/0 Ss+ 06:21 0:00 bash -l
xplore 2981707 0.0 0.0 13932 3464 ? Ss 06:15 0:00 /bin/bash /scripts/dbi_entrypoint.sh
[xplore@ds1-0 logs]$
[xplore@ds1-0 logs]$ /scripts/dbi_ft_liveness.sh
INFO - FT liveness exit code is '1' -- Unexpected http response from component 'PrimaryDsearch' for url 'https://xxx:9302/dsearch'...
[xplore@ds1-0 logs]$
The same thing happened when trying to stop xPlore to begin the investigation: it showed a problem connecting to the Dsearch:
[xplore@ds1-0 logs]$ $STARTSTOP stop
**
** The PrimaryDsearch is running with PID: 3015920
**
INFO - Stopping the PrimaryDsearch...
Instance {PrimaryDsearch} is about to shut down, wait for shutdown complete message.
Exception in thread "main" java.lang.IllegalArgumentException: Fail to connect remote server (https://ds1-0.ds1.dctm-ns1-name.svc.cluster.local:9302)
at com.emc.documentum.core.fulltext.client.admin.cli.DSearchAdminScript.getAdminService(DSearchAdminScript.java:86)
at com.emc.documentum.core.fulltext.client.admin.cli.DSearchAdminScript.stopNode(DSearchAdminScript.java:187)
...
Caused by: com.emc.documentum.core.fulltext.common.admin.DSearchAdminException: [ERROR] ServiceException thrown out:
com.emc.documentum.core.fulltext.indexserver.admin.controller.ServiceException: Exception happened while connect to instance: PrimaryDsearch
at com.emc.documentum.core.fulltext.client.admin.api.impl.ESSAdminSOAPClient.<init>(ESSAdminSOAPClient.java:110)
at com.emc.documentum.core.fulltext.client.admin.api.impl.ESSAdminServiceImpl.<init>(ESSAdminServiceImpl.java:62)
...
Caused by: com.emc.documentum.core.fulltext.indexserver.admin.controller.ServiceException: Exception happened while connect to instance: PrimaryDsearch
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
...
{
"outcome" => "success",
"result" => undefined
}
[xplore@ds1-0 logs]$
I had no idea what had been happening in this environment in the few days prior to the DataCenter issue, as the day-to-day support was handled by another team. However, based on the logs, it looked like it could be linked to an online rebuild. Therefore, I went and checked the recent changes made to the indexserverconfig.xml file and I found this:
[xplore@ds1-0 logs]$ cd $CONFIG_HOME
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ ll -tr indexserverconfig.xml*
-rw-r----- 1 xplore xplore 33388 Jul 3 2021 indexserverconfig.xml.bakHttp
-rw-r----- 1 xplore xplore 33389 Jul 3 2021 indexserverconfig.xml.patch.bak.custom-v1
-rw-r----- 1 xplore xplore 34628 Oct 8 07:34 indexserverconfig.xml.patch.bak.custom-v2
-rw-r----- 1 xplore xplore 36696 Oct 10 01:11 indexserverconfig.xml
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ diff indexserverconfig.xml indexserverconfig.xml.patch.bak.custom-v2
2c2
< <index-server-configuration enable-lemmatization="true" config-check-interval="60000" revision="1.29">
---
> <index-server-configuration enable-lemmatization="true" config-check-interval="60000" revision="1.22">
159,163d158
< <sub-path path="dmftmetadata//a_status" type="string" enumerate-repeating-elements="false" full-text-search="true" value-comparison="true" returning-contents="true" include-descendants="false" description="Used by REPO_NAME to compute facets." boost-value="1.0" compress="true" leading-wildcard="false" sortable="false" include-start-end-token-flags="true"/>
< <sub-path path="dmftmetadata//doc_business_unit" type="string" enumerate-repeating-elements="false" full-text-search="true" value-comparison="true" returning-contents="true" include-descendants="false" description="Used by REPO_NAME to compute facets." boost-value="1.0" compress="true" leading-wildcard="false" sortable="false" include-start-end-token-flags="true"/>
< <sub-path path="dmftmetadata//site_unit" type="string" enumerate-repeating-elements="false" full-text-search="true" value-comparison="true" returning-contents="true" include-descendants="false" description="Used by REPO_NAME to compute facets." boost-value="1.0" compress="true" leading-wildcard="false" sortable="false" include-start-end-token-flags="true"/>
< <sub-path path="dmftmetadata//doc_responsible_author" type="string" enumerate-repeating-elements="false" full-text-search="true" value-comparison="true" returning-contents="true" include-descendants="false" description="Used by REPO_NAME to compute facets." boost-value="1.0" compress="true" leading-wildcard="false" sortable="false" include-start-end-token-flags="true"/>
< <sub-path path="dmftmetadata//material_number" type="string" enumerate-repeating-elements="false" full-text-search="true" value-comparison="true" returning-contents="true" include-descendants="false" description="Used by REPO_NAME to compute facets." boost-value="1.0" compress="true" leading-wildcard="false" sortable="false" include-start-end-token-flags="true"/>
373,377c368
< <collection usage="Data" document-category="dftxml" name="col1">
< <properties>
< <property value="dmftdoc_9640f" name="Build_dmftdoc"/>
< </properties>
< </collection>
---
> <collection usage="Data" document-category="dftxml" name="col1"/>
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ date
Mon Oct 14 07:23:34 UTC 2024
[xplore@ds1-0 config]$
Based on the above, it looked evident that five new facets were added by someone on 8-Oct, and for these facets to appear in Documentum Search, an online rebuild then needs to be performed. The last section/difference is related to that online rebuild: whenever you start an online rebuild, the collection definition is changed to include a new property named “Build_dmftdoc“, and when it’s completed, that property gets removed. However, this is a fairly small index, so an online rebuild should be done in half a day at most. Above, it appears as if the online rebuild started on 10-Oct was still in progress, which is definitely not normal.
I have no evidence of it, but I can only assume that someone started an online rebuild on 10-Oct for the “col1” collection and that, before it could complete, the DataCenter issue started, which left the Dsearch pod (or one of the CPS-Only pods) frozen and in an inconsistent/corrupted state. Because of that, the document 090f123480096bde would have a different status in the index / xDB, and that would prevent the Dsearch from starting properly, as seen in the logs.
To fix this issue, I tried to cancel/remove the online rebuild by modifying the indexserverconfig.xml file manually. For that purpose, I simply incremented the revision number and restored the collection “col1” definition to what it should be without a running online rebuild (so removing the 3 properties lines):
[xplore@ds1-0 config]$ cp -p indexserverconfig.xml indexserverconfig.xml.with_issue
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ vi indexserverconfig.xml ### Corrected the file here
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ cp -p indexserverconfig.xml indexserverconfig.xml.with_modification
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ diff indexserverconfig.xml.with_issue indexserverconfig.xml.with_modification
2c2
< <index-server-configuration enable-lemmatization="true" config-check-interval="60000" revision="1.29">
---
> <index-server-configuration enable-lemmatization="true" config-check-interval="60000" revision="1.30">
373,377c373
< <collection usage="Data" document-category="dftxml" name="col1">
< <properties>
< <property value="dmftdoc_9640f" name="Build_dmftdoc"/>
< </properties>
< </collection>
---
> <collection usage="Data" document-category="dftxml" name="col1"/>
[xplore@ds1-0 config]$
With this new revision “1.30“, I started the Dsearch again and this time it was able to start successfully:
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ $STARTSTOP start
**
** The PrimaryDsearch is shutdown
**
INFO - Starting the PrimaryDsearch...
**
** The PrimaryDsearch is running with PID: 53180
**
[xplore@ds1-0 config]$
[xplore@ds1-0 config]$ cdlogPrimaryDsearch
[xplore@ds1-0 logs]$ cat dsearch.log
2024-10-14 07:30:24,118 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSNode - Starting xPlore
2024-10-14 07:30:24,126 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - Initializing xPlore instance
2024-10-14 07:30:24,143 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Initializing xDB federation and database
2024-10-14 07:30:25,376 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.engine.xhive.impl.XhiveSchemaHandler - File system configuration (revision 1.30) is higher than xDB revision (1.29). Updating indexserverconfig.xml in xDB.
2024-10-14 07:30:25,931 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.indexserver.core.ESSActiveNodesRegistry - Register instance PrimaryDsearch
2024-10-14 07:30:25,932 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Starting xDB for instance primary
2024-10-14 07:30:26,123 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Initialize xDB driver to local driver with cache pages: 524288 and Socket timeout :1000 ms
2024-10-14 07:30:26,395 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - XDB is listening at 9330 successfully
2024-10-14 07:30:26,504 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - Initialize xDB driver to remote driver with cache pages: 524288 and Socket timeout :1000 ms
2024-10-14 07:30:26,504 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.engine.xhive.impl.XhiveManager - The XML database server is started successfully. {xDB version=xDB 10_7@4558357}
2024-10-14 07:30:26,514 DEBUG [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Begin to connect to CPS at (local) with connection 3
2024-10-14 07:30:33,925 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Connected to a local Content Processing Service with version [20.2.0000.0015].
2024-10-14 07:30:33,986 DEBUG [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Begin to connect to CPS at (https://cps1-0.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl) with connection 3
2024-10-14 07:30:34,447 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Connected to a remote Content Processing Service [https://cps1-0.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl] with version [20.2.0000.0015].
2024-10-14 07:30:34,636 DEBUG [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Begin to connect to CPS at (https://cps1-1.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl) with connection 3
2024-10-14 07:30:34,838 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.cps.CPSSubmitter - Connected to a remote Content Processing Service [https://cps1-1.cps1.dctm-ns1-name.svc.cluster.local:9302/cps/ContentProcessingService?wsdl] with version [20.2.0000.0015].
2024-10-14 07:30:35,030 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - Initialize domain SystemData
2024-10-14 07:30:35,128 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - Initialize domain REPO_NAME
2024-10-14 07:30:35,387 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - Starting Audit Service
2024-10-14 07:30:35,391 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.services.audit.impl.FtAuditService - Audit records purge task will launch at: 2024-10-15T00:00:00+00:00
2024-10-14 07:30:35,392 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.indexserver.services.FtBaseService - Auditing service started
2024-10-14 07:30:35,392 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - Starting System Metrics Service
2024-10-14 07:30:35,396 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.indexserver.services.FtBaseService - SystemMetrics service started
2024-10-14 07:30:35,396 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - Starting Group Cache Service
2024-10-14 07:30:35,397 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.s.groupcache.impl.FtGroupCacheService - Enable incremental group cache update
2024-10-14 07:30:35,397 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.indexserver.services.FtBaseService - GroupCache service started
2024-10-14 07:30:35,399 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.services.security.GlobalACLCache - Use global ACL cache, user count=10, clear delay time=0
2024-10-14 07:30:35,400 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.services.security.GlobalACECache - Use global ACE cache, ACE cache size=1000000
2024-10-14 07:30:35,400 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - Starting Auto Warmup Service
2024-10-14 07:30:35,410 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.indexserver.services.FtBaseService - Warmup service started
2024-10-14 07:30:35,417 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSContext - The xPlore instance PrimaryDsearch initialized
2024-10-14 07:30:35,425 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.core.index.thread.WorkerThreadPool - Spawn a new thread CPSWorkerThread-1
2024-10-14 07:30:35,426 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.core.index.thread.WorkerThreadPool - Spawn a new thread CPSWorkerThread-2
2024-10-14 07:30:35,427 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.core.index.thread.WorkerThreadPool - Spawn a new thread IndexWorkerThread-1
2024-10-14 07:30:35,427 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.core.index.thread.WorkerThreadPool - Spawn a new thread IndexWorkerThread-2
2024-10-14 07:30:35,438 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSNode - The xPlore instance PrimaryDsearch started the indexing threads.
2024-10-14 07:30:35,451 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.search.threads.SearchThreadPool - Spawn 4 search threads
2024-10-14 07:30:36,445 INFO [ServerService Thread Pool -- 80] c.e.d.c.f.i.admin.controller.jmx.ESSAdminJMXPlugin - Admin agent started.
2024-10-14 07:30:36,445 INFO [ServerService Thread Pool -- 80] c.e.d.core.fulltext.indexserver.core.ESSNode - The xPlore instance PrimaryDsearch started. {version=20.2.0000.0015}
[xplore@ds1-0 logs]$
To close this topic, a new online rebuild was started to make sure the facets show up properly, since I wasn’t sure about the exact status of the other collections. Once it was done, the Dsearch pod could restart again without problems, so the issue was fully fixed.
The article Documentum – xPlore Dsearch not starting with a stuck rebuild first appeared on dbi Blog.
PostgreSQL 18: Tweaking relation statistics
Up to now, there has been no official way in PostgreSQL to tweak optimizer statistics manually. While this is quite common in other relational databases, PostgreSQL did not provide anything for it. The first steps to allow this have now been committed for PostgreSQL 18.
Before looking at the two new functions which have been added, let’s create a small test table and an index:
postgres=# create table t ( a int, b text );
CREATE TABLE
postgres=# insert into t values (1,'aa');
INSERT 0 1
postgres=# insert into t select i, 'bb' from generate_series(2,100) i;
INSERT 0 99
postgres=# analyze t;
ANALYZE
postgres=# create index i on t(b);
CREATE INDEX
postgres=# \d t
Table "public.t"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | text | | |
Indexes:
"i" btree (b)
The number of pages and the number of rows are currently reported as this:
postgres=# select relpages,reltuples from pg_class where relname = 't';
relpages | reltuples
----------+-----------
1 | 100
(1 row)
Asking for the explain plan of a simple select over the table, restricting column b to 'aa', gives a sequential scan:
postgres=# explain select * from t where b = 'aa';
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..2.25 rows=1 width=7)
Filter: (b = 'aa'::text)
(2 rows)
As the table consists of only one block, it is still faster to read the whole table than to use the index, even if only one row matches our criteria. One of the two new functions introduced by the commit linked above is pg_set_relation_stats:
postgres=# \x
Expanded display is on.
postgres=# \df pg_set_relation_stats
List of functions
-[ RECORD 1 ]-------+------------------------------------------------------------------------------------------------------------------------------------------
Schema | pg_catalog
Name | pg_set_relation_stats
Result data type | boolean
Argument data types | relation regclass, relpages integer DEFAULT NULL::integer, reltuples real DEFAULT NULL::real, relallvisible integer DEFAULT NULL::integer
Type | func
As you can easily see, this function takes a relation name, the number of pages, the number of rows and the number of all-visible pages (relallvisible) as input parameters. This allows us to tweak the statistics so that the optimizer will go for an index scan instead of the sequential scan:
postgres=# select * from pg_set_relation_stats('t'::regclass, 1, 1000000 );
pg_set_relation_stats
-----------------------
t
(1 row)
postgres=# \x
Expanded display is off.
postgres=# select relpages,reltuples from pg_class where relname = 't';
relpages | reltuples
----------+-----------
1 | 1e+06
(1 row)
postgres=# explain select * from t where b = 'aa';
QUERY PLAN
-----------------------------------------------------------------
Index Scan using i on t (cost=0.17..183.18 rows=10000 width=7)
Index Cond: (b = 'aa'::text)
(2 rows)
postgres=#
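The call above used positional arguments. Given the signature shown by \df earlier, the same call can also be written with named notation, which makes it explicit which statistic is being set (a sketch, not part of the original example):
select pg_set_relation_stats('t'::regclass, relpages => 1, reltuples => 1000000);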
Nice, now we can debug optimizer decisions by modifying these statistics. Just be aware that the statistics we set using this function are only temporary: any manual or automatic analyze will overwrite them:
postgres=# analyze t;
ANALYZE
postgres=# select relpages,reltuples from pg_class where relname = 't';
relpages | reltuples
----------+-----------
1 | 100
(1 row)
postgres=# explain select * from t where b = 'aa';
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..2.25 rows=1 width=7)
Filter: (b = 'aa'::text)
(2 rows)
The second function can be used to reset the statistics to the values they would have right after the table is created:
postgres=# select * from pg_clear_relation_stats('t'::regclass);
pg_clear_relation_stats
-------------------------
t
(1 row)
postgres=# select relpages,reltuples from pg_class where relname = 't';
relpages | reltuples
----------+-----------
0 | -1
(1 row)
The article PostgreSQL 18: Tweaking relation statistics first appeared on dbi Blog.
Migrating AWX from one Kubernetes cluster to another: A custom approach
In this guide, we’ll walk through migrating an AWX instance from one Kubernetes infrastructure to another, with two important considerations. First, both AWX instances are on completely different networks, meaning there’s no direct connectivity between them. Second, we aim to replicate the credentials (including passwords) stored in AWX, which requires careful handling. This approach differs from the official documentation due to these two specific constraints.
Step 1: Backup AWX on the old infrastructure
To back up AWX on the old infrastructure, we’ll use the AWXBackup resource provided by the AWX Operator. This will capture all necessary configurations, including credentials, job templates, and database data.
- Create the AWXBackup resource
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
name: awx-backup
namespace: <namespace-awx>
spec:
deployment_name: <awx-instance-name>
- Apply the backup configuration
kubectl apply -f awxbackup.yaml
- Verify the backup
Check the status of the AWXBackup resource to ensure the backup is complete
kubectl get awxbackup -n <namespace-awx>
- Access the backup data
AWXBackup creates a PVC to store the backup data. We need to retrieve it.
kubectl get pvc -n <namespace-awx>
- Mount the backup PVC
Create a temporary pod to access the backup files
apiVersion: v1
kind: Pod
metadata:
name: awx-backup-access
namespace: <namespace-awx>
spec:
containers:
- name: backup-container
image: busybox:latest
command: ["/bin/sh", "-c", "sleep 3600"]
volumeMounts:
- mountPath: /backup-data
name: awx-backup-pvc
volumes:
- name: awx-backup-pvc
persistentVolumeClaim:
claimName: <awx-backup-pvc>
- Compress the backup
Once inside the pod, go to the backup directory and archive the latest directory
kubectl exec -it awx-backup-access -n <namespace-awx> -- /bin/sh
cd /backup-data
ls -l
## Find the latest directory
tar -czvf /awx_backup.tar.gz <latest-directory>
- Copy the archive locally
Use kubectl cp to copy the archive from the pod to your local machine
kubectl cp <namespace-awx>/awx-backup-access:/backup-data/awx_backup.tar.gz ./awx_backup.tar.gz
- Clean up the temporary pod
Once the backup is copied, delete the temporary pod
kubectl delete pod awx-backup-access -n <namespace-awx>
- Recover the decryption key for secret keys
kubectl get secrets -n <namespace-awx> <awx-instance-name>-secret-key -o jsonpath='{.data.secret_key}' && echo;
Save the base64-encoded key; we will need it during the restore step.
Step 2: Setup the new AWX instance
On the new infrastructure, we first need to install AWX via the AWX Operator.
- Install the AWX Operator
For this step follow the official documentation of AWX Operator
If you need to deploy AWX using images from a local registry, you can read my other article: Deploy awx-operator with Helm using images from a local registry
- Verify the AWX deployment
Check that the new AWX instance is up and running
kubectl get awx -n <new-namespace-awx>
Next, we need to create an AWXBackup on the new infrastructure.
- Create an AWXBackup for the backup data
Create the awxbackup.yaml file:
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
name: awx-backup-migration
namespace: <namespace-awx>
spec:
deployment_name: <awx-instance-name>
- Apply the backup configuration
kubectl apply -f awxbackup.yaml
- Verify the backup
Check the status of the AWXBackup resource to ensure the backup is complete
kubectl get awxbackup -n <namespace-awx>
- Identify the backup PVC
kubectl get pvc -n <namespace-awx>
Now that the new AWX is set up, we’ll transfer the backup data and restore it.
- Transfer the backup archive
Copy the awx_backup.tar.gz file to the new infrastructure by uploading it to the new backup PVC using a temporary pod
- Create a temporary pod to restore data
Create the awx-restore-access.yaml file
apiVersion: v1
kind: Pod
metadata:
name: awx-backup-restore
namespace: <new-namespace-awx>
spec:
containers:
- name: restore-container
image: busybox:latest
command: ["/bin/sh", "-c", "sleep 3600"]
volumeMounts:
- mountPath: /backup-data
name: awx-backup-pvc
volumes:
- name: awx-backup-pvc
persistentVolumeClaim:
claimName: <pvc-for-awx-backup>
kubectl apply -f awx-restore-access.yaml
- Use kubectl cp to upload the archive to the pod
kubectl cp ./awx_backup.tar.gz <new-namespace-awx>/awx-backup-restore:/awx_backup.tar.gz
- Replace the data from archive
Inside the pod, extract the archive
kubectl exec -it awx-backup-restore -n <new-namespace-awx> -- /bin/sh
cd /backup-data
ls -l
## Find the latest directory
rm -rf /backup-data/<latest-backup-directory>/{tower.db,awx-objects}
cd
tar -xzvf /awx_backup.tar.gz
cp <backup-directory-from-tar.gz>/tower.db /backup-data/<latest-backup-directory>/.
cp -r <backup-directory-from-tar.gz>/awx-objects /backup-data/<latest-backup-directory>/.
vi /backup-data/<latest-backup-directory>/secret.yml
## Replace the value of the variable
## secrets:
## secretKeySecret:
## data: {secret_key: ###insert here the base64 decryption key recovered at the end of Step 1### }
- Create the AWXRestore resource
Create an AWXRestore resource to apply the backup
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
name: awx-backup-restore
namespace: <new-namespace-awx>
spec:
deployment_name: <awx-instance-name>
backup_name: awx-backup-migration
no_log: false
force_drop_db: true
- Apply the AWXRestore
kubectl apply -f awxrestore.yaml
- Monitor the restoration
Ensure the restoration completes successfully
kubectl get awxrestore -n <new-namespace-awx>
- You can log in as admin (the password will be that of the old infrastructure)
- Explore the various AWX resources and check that everything has been migrated correctly
- Run a template job to validate correct operation
By following this procedure, we’ve successfully migrated an AWX instance across two isolated Kubernetes clusters while maintaining full fidelity of the AWX credentials and configurations.
The article Migrating AWX from one Kubernetes cluster to another: A custom approach first appeared on dbi Blog.
Mastering Custom Integration Runtimes in Azure Data Factory: An Introduction to Our Upcoming Series
Dear Readers,
This post serves as an introduction to a six-part series focused on the Integration Runtime (IR) within Azure Data Factory (ADF). Over the course of these six articles, you’ll gain essential knowledge and practical skills to choose and implement the right Integration Runtime for your specific requirements. Each article will explore different aspects of IR, providing you with a comprehensive guide to handling various integration scenarios effectively.
What is Azure Data Factory (a.k.a. ADF)
Azure Data Factory is a cloud-based data integration service that enables the creation and scheduling of data-driven workflows. Its primary strength lies in its capability to interact with multiple types of data sources and orchestrate data flows between them.
What is Integration Runtime?
The Integration Runtime (IR) is the compute infrastructure that ADF uses to provide data integration capabilities across various network environments. There are three types of integration runtimes offered by Data Factory:
- Azure Integration Runtime: Managed by Microsoft and can only access data sources in public networks.
- Self-hosted Integration Runtime: Deployed and managed by you, allowing access to both public and private networks. You’re responsible for patching, scaling, and maintaining the infrastructure.
- Azure-SSIS Integration Runtime: A Microsoft-managed IR designed to run SQL Server Integration Services (SSIS) packages in ADF. This IR can access both public and private networks.
In this series, we will focus on the custom IR, namely Self-hosted IR and Azure-SSIS IR, which offer greater flexibility, such as the ability to interact with private networks and, in the case of SSIS IR, the execution of SSIS packages.
Context and Case Studies
To guide you through the process of understanding the provided solutions, we’ll explore the following fictional scenarios:
Case Study 1: LucyCheck Ltd.
A Medical analysis laboratory where files are stored locally on a PC connected to the Internet. The medical regulatory authority, which operates on Azure and uses Azure Data Factory for its data integration processes, needs to retrieve specific data from the laboratory’s local system. This data will then be integrated into the regulatory authority’s Data Warehouse hosted on Azure.
How should this integration be implemented?
Case Study 2: Helory Swags Ltd
A luxury children’s clothing retailer is in the process of migrating its infrastructure to Azure. According to company policy, all new development must now take place within Azure.
The management requires reporting that involves data from a Line of Business (LOB) system, which is still stored in an on-premises SQL Server and won’t be migrated until next year.
How can an integration solution be implemented to integrate the on-premises SQL Server in the Azure Data Warehouse?
Case Study 3: Moera LinkedTeen Ltd.
A communications company with a hybrid infrastructure and several ADF instances. To fulfill various requests, these ADF instances need to extract data from the same on-premises SQL Server instance.
What is the best way to implement this integration?
Case Study 4: Aelynn Money Transfer Ltd.
A financial services company is migrating its entire infrastructure to Azure, except for one Line of Business (LOB) system, which will remain on-premises. The company’s data warehouse is populated with data from this LOB using a complex SSIS package. Due to time constraints, this SSIS package cannot be easily redeveloped in Azure Data Factory (ADF).
How can we ensure the continued use of this complex SSIS package in Azure while maintaining its connection to the on-premises LOB data?
Upcoming Solutions in the Series
The upcoming articles in the series will cover the following topics:
- Part 1: Security prerequisites for ADF access to on-premises environment.
- Part 2: ADF access to on-premises file systems.
- Part 3: ADF access to on-premises SQL Servers.
- Part 4: Implementing a shared ADF Integration Runtime.
- Part 5: Point-to-site (P2S) VPN setup.
- Part 6: Using SSIS IR in ADF to connect with on-premises SQL Servers.
By the end of this series, you will have a comprehensive understanding of both Self-hosted and Azure-SSIS Integration Runtimes in Azure Data Factory. Armed with this knowledge, you will confidently choose and implement the most suitable Integration Runtime for your organization’s needs.
Stay tuned for the next instalment of this exciting journey!
The article Mastering Custom Integration Runtimes in Azure Data Factory: An Introduction to Our Upcoming Series first appeared on dbi Blog.
PostgreSQL 18: REJECT_LIMIT for COPY
With the just released version 17 of PostgreSQL, copy got the “ON_ERROR” switch. This is very useful if you load something which contains rows that cannot be inserted into the target table. Before that, copy would just stop and error out. The only option was to fix the data and then copy again. Using “ON_ERROR”, copy can be told to continue loading and ignore those rows with bad data.
The default behavior is still to stop as soon as there is a row/line which cannot be loaded:
postgres=# create table t ( a int, b int, c text );
CREATE TABLE
postgres=# \! cat load.txt
1,1,'aaa'
2,2,'bbb'
3,3,'ccc'
4,four,'ddd'
5,5,'eee'
6,6,'fff'
7,7,'ggg'
8,eight,'hhh'
9,9,'iii'
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',');
ERROR: invalid input syntax for type integer: "four"
CONTEXT: COPY t, line 4, column b: "four"
Starting with PostgreSQL 17, we can tell copy to ignore those rows and continue to load the data:
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore);
COPY 7
postgres=# select * from t;
a | b | c
---+---+-------
1 | 1 | 'aaa'
2 | 2 | 'bbb'
3 | 3 | 'ccc'
5 | 5 | 'eee'
6 | 6 | 'fff'
7 | 7 | 'ggg'
9 | 9 | 'iii'
(7 rows)
What we cannot do up to PostgreSQL 17 is set a limit for bad data. This will be possible starting with PostgreSQL 18, as copy got a new switch: “REJECT_LIMIT”.
Using this, we can tell copy to fail as soon as more than n rows cannot be loaded:
postgres=# truncate t;
TRUNCATE TABLE
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore, reject_limit 1);
ERROR: skipped more than REJECT_LIMIT (1) rows due to data type incompatibility
CONTEXT: COPY t, line 8, column b: "eight"
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore, reject_limit 2);
ERROR: skipped more than REJECT_LIMIT (2) rows due to data type incompatibility
CONTEXT: COPY t, line 10, column a: ""
postgres=# copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore, reject_limit 3);
COPY 7
postgres=# select * from t;
a | b | c
---+---+-------
1 | 1 | 'aaa'
2 | 2 | 'bbb'
3 | 3 | 'ccc'
5 | 5 | 'eee'
6 | 6 | 'fff'
7 | 7 | 'ggg'
9 | 9 | 'iii'
(7 rows)
Nice, this gives more control over how to handle data which does not fit into the target table.
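As a side note, assuming the LOG_VERBOSITY option that copy gained in PostgreSQL 17 is available in your build, it can be combined with these switches to also get a NOTICE for every skipped row (a sketch, not part of the example above):
copy t from '/home/postgres/load.txt' with ( delimiter ',', on_error ignore, reject_limit 3, log_verbosity verbose );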
The article PostgreSQL 18: REJECT_LIMIT for COPY first appeared on dbi Blog.
Understanding Head Blockers in SQL-Server
A customer of mine had a recurring issue with one of his business-critical applications where everything was frozen and stuck for the users. The issue was caused by a head-blocking transaction which increased blocking in the database exponentially. Depending on your technical background, you may know that locking is a very important mechanism in relational database systems to ensure transactional isolation, thus fulfilling the “I” in the theoretical ACID concept. However, when locks cause extensive blocking, it can lead to performance issues.
But what exactly is a head blocker?
A head blocker is a transaction that is either very slow or never commits, holding locks on database resources. Other transactions that must access the resources locked by the head blocker are also holding locks while waiting for those resources.
On the next level, there are more transactions that need access to resources locked by the transactions blocked by the head blocker, and they too are holding locks. This process continues on multiple levels, creating a “lock chain” where all the transactions are waiting for the first transaction (the head blocker) to commit or roll back.
For the non-technical blog reader, let me give you an example from a daily life situation:
You are in the supermarket and you see a long queue at the checkout. At the front of the queue you can see person A, who is talking with the cashier about the weather, the next neighborhood event and the neighbor's cat. This person is the head blocker. Behind that person, person B and person C are waiting. This is the beginning of the lock chain, and the checkout is the database resource on which the head blocker is holding a lock.
To better understand the complexity of the lock mechanism in relational database systems with this metaphor imagine that only one person is allowed to have a specific item in their shopping cart at any given time in the supermarket.
Person C is waiting for person A at the checkout and has a liter of milk in their shopping cart. Person D, who already has bread and butter in their shopping cart, but wants some milk too, must wait until person C can pay and leave the supermarket. Person E and F, who also already have some items in their shopping cart but still have butter or bread on their shopping list, must wait too. The whole process continues until everyone in the supermarket is waiting for person A to complete the story about the neighbor’s cat.
In the case of my customer, the IT staff there terminated the head-blocking transaction manually as soon as they recognized it.
The major problem with this approach was that the IT staff only noticed the problem through user feedback, so way too late.
Therefore I implemented a custom-developed alert solution for the case when blocking increases within the application's database. This enabled the IT staff to terminate the head-blocking transaction before the application users were disturbed.
How to determine blocking in a database?
SQL-Server offers some very useful dynamic management views (DMV) to query information about blocking, for example the DMV “sys.dm_exec_requests”. When you query this DMV and you are interested in blocking, the most interesting column is the “blocking_session_id” column. This column shows you which session is currently blocking the particular request. If the request isn’t blocked, it will contain a “0”.
There are also other pragmatic ways to gather information about currently running processes and their blocking. One very user-friendly method that provides a wealth of useful information is the stored procedure sp_whoisactive, developed by Adam Machanic. This stored procedure displays currently running processes in SQL Server in a user-friendly manner and can be controlled through multiple input parameters. It also returns a blocking_session_id column and many other useful details, such as the SQL text, the number of reads and writes, CPU time, and more. The stored procedure is not available by default in SQL Server but can be downloaded and installed from GitHub.
Another valuable solution is the stored procedure sp_HumanEventsBlockViewer, created by Erik Darling. This procedure is used in conjunction with an extended event session, which collects data from the Blocked Process Report. The stored procedure reads from the extended event session and presents the information in a very useful format, including valuable details such as the SQL text, the isolation level of the transactions, the wait time, and more.
I used the DMV “sys.dm_exec_requests” in the stored procedure that I developed to monitor blocking in the customer’s database.
Please note that I’m just sharing some key snippets of the stored procedure and not the entire procedure, as sharing the whole one would go beyond the planned scope of this blog post.
The stored procedure first queries the DMV with the “count” aggregation function to get the total number of blocked requests:
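The original snippet is not reproduced here; a minimal sketch of what such a query could look like (variable names are mine):
DECLARE @BlockedRequestCount int;

-- total number of requests currently waiting on another session
SELECT @BlockedRequestCount = COUNT(*)
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;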
After that, the stored procedure checks if the number of blocked requests exceeds the defined sensitivity threshold:
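Continuing the sketch, with a hypothetical @SensitivityThreshold value:
DECLARE @SensitivityThreshold int = 20;  -- hypothetical threshold, adjust to your environment

IF @BlockedRequestCount > @SensitivityThreshold
BEGIN
    PRINT 'Blocking threshold exceeded, collecting details...';
END;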
If this is true, the stored procedure starts collecting the information we are interested in. The first piece of information is the session that is holding the most locks in the database. Therefore I wrote the query below, which stores that session ID in a variable:
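A possible way to do this, here approximated as the session that blocks the most requests (again a sketch, not the original code):
DECLARE @HeadBlockerSessionId int;

-- session that is blocking the highest number of requests
SELECT TOP (1) @HeadBlockerSessionId = blocking_session_id
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0
GROUP BY blocking_session_id
ORDER BY COUNT(*) DESC;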
After that, the stored procedure additionally gathers the number of requests blocked by this session and stores it in a second variable:
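For example:
DECLARE @RequestsBlockedByHeadBlocker int;

SELECT @RequestsBlockedByHeadBlocker = COUNT(*)
FROM sys.dm_exec_requests
WHERE blocking_session_id = @HeadBlockerSessionId;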
After that, I used the session ID of the session that is blocking the most requests as a starting point to walk the lock chain up through the DMV:
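One way to walk such a lock chain is a recursive CTE over the same DMV; the original procedure may do it differently, so treat this as a sketch:
;WITH LockChain AS
(
    -- level 1: requests blocked directly by the head blocker
    SELECT session_id, blocking_session_id, 1 AS chain_level
    FROM sys.dm_exec_requests
    WHERE blocking_session_id = @HeadBlockerSessionId

    UNION ALL

    -- deeper levels: requests blocked by sessions that are themselves blocked
    SELECT r.session_id, r.blocking_session_id, lc.chain_level + 1
    FROM sys.dm_exec_requests AS r
    JOIN LockChain AS lc ON r.blocking_session_id = lc.session_id
)
SELECT session_id, blocking_session_id, chain_level
FROM LockChain
ORDER BY chain_level, session_id;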
Finally, I formatted everything in HTML and sent the information via Database Mail. This resulted in the mail below, which warned the recipients that the blocking threshold had been exceeded and provided some valuable information:
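The sending part relies on Database Mail; a minimal, hypothetical example of such a call (profile name and recipients are placeholders):
DECLARE @HtmlBody nvarchar(max) = N'<html><body><p>Blocking threshold exceeded.</p></body></html>';

EXEC msdb.dbo.sp_send_dbmail
    @profile_name = N'DBA_Mail_Profile',      -- placeholder profile
    @recipients   = N'dba-team@example.com',  -- placeholder recipients
    @subject      = N'SQL Server blocking alert',
    @body         = @HtmlBody,
    @body_format  = 'HTML';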
Summary:
This blog post presents an approach to monitoring blocking within a SQL-Server database. Locking is an important mechanism in SQL-Server and other relational database systems to ensure transactional isolation, and it is normal for locking to lead to blocking to an acceptable extent. However, extensive blocking can slow down performance, which is considered a problem. There are multiple potential root causes of blocking problems, including:
- Poor query performance, leading to long running transactions
- Lock escalation
- High concurrency on particular database resources
- Bad transaction design
- High transaction isolation level
If you are facing similar problems or if you have any questions, don’t hesitate to share them in the comment section!
The article Understanding Head Blockers in SQL-Server first appeared on dbi Blog.
Enhancing PostgreSQL Performance with Index Advisor and HypoPG
In database management, query performance is critical for efficient data retrieval. PostgreSQL, a powerful open-source database, offers various extensions to optimize and tune query performance. Two such extensions are Index Advisor and HypoPG. These tools help database administrators (DBAs) identify and test potential indexes to improve query performance without the overhead of creating and dropping indexes repeatedly. This blog will guide you through the installation and usage of these extensions to enhance your PostgreSQL performance.
1. Installing Index Advisor and HypoPG Extensions
Before diving into the usage of these extensions, let’s start with the installation process.
1.1. PostgreSQL Index Advisor
Index Advisor is a PostgreSQL extension that recommends indexes based on query analysis. One nice point is that the extension code is still maintained. index_advisor requires Postgres with hypopg installed. Here’s how to install it:
- Installation:
--- Install hypopg
postgres@server> sudo apt install postgresql-XY-hypopg
--- Install index_adviser
postgres@server> git clone https://github.com/supabase/index_advisor.git
postgres@server> cd index_advisor
postgres@server> sudo make install
- Enable the Extension in PostgreSQL:
postgres# CREATE EXTENSION hypopg;
postgres# create extension if not exists index_advisor cascade;
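As a quick sanity check (not part of the original installation steps), you can verify that both extensions are registered:
postgres# select extname, extversion from pg_extension where extname in ('hypopg', 'index_advisor');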
Now that the extensions are installed, let’s explore how to use them to analyze and optimize query performance.
2. Using Index Advisor to Recommend Indexes
Index Advisor analyzes your queries and suggests indexes that can improve performance. Let’s see it in action with a sample query.
2.1. Example Query
Consider the following query that retrieves data from the dbi_technologies, dbi_techres, and dbi_techlevels tables:
SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
FROM dbi_technologies tech
LEFT JOIN
(SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
FROM dbi_techres techres,
dbi_techlevels techlvl
WHERE techres.res_id = $2
AND techlvl.id = techres.techlevel_id) AS reslevel
ON tech.id = reslevel.tech_id
WHERE tech.cat_id = $1
ORDER BY techlabel;
To get index recommendations for the above query, use the following command:
SELECT * FROM index_advisor('YOUR QUERY HERE');
-[ RECORD 1 ]-------------------------
startup_cost_before | 70.97
startup_cost_after | 66.76
total_cost_before | 71.00
total_cost_after | 66.79
index_statements | ['CREATE INDEX ON dbicc.dbi_technologies USING btree (cat_id)', 'CREATE INDEX ON dbicc.dbi_techres USING btree (res_id)']
errors | []
- startup_cost_before and startup_cost_after: These show the estimated cost until the query can return the first row of the result before and after applying the recommended indexes. For example, with a hash join, this includes the cost to build the internal table. For a sequential scan, the startup cost is always zero, as rows can be returned immediately.
- total_cost_before and total_cost_after: Estimated total cost of executing the query before and after applying the recommended indexes.
- index_statements: Suggested SQL statements to create the indexes. In this case, we got two suggestions.
- errors: Any errors encountered during the analysis (empty in this case).
HypoPG allows you to test the impact of hypothetical indexes without actually creating them. This is useful for evaluating potential indexes before committing to their creation. The hypothetical indexes created by HypoPG are stored in your connection’s private memory, not in any catalog. This means they won’t increase the size of any table or affect other connections.
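Two companion functions are handy while experimenting: hypopg() lists the hypothetical indexes defined in the current connection, and hypopg_reset() removes them all. A short sketch:
-- list the hypothetical indexes defined in this connection
SELECT * FROM hypopg();
-- remove all hypothetical indexes of this connection once testing is done
SELECT hypopg_reset();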
3.1. Analyzing Query Performance without Hypothetical Indexes
Let’s start by testing the query performance without any hypothetical indexes.
postgres# EXPLAIN SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
FROM dbi_technologies tech
LEFT JOIN (
SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
FROM dbi_techres techres, dbi_techlevels techlvl
WHERE techres.res_id = 60
AND techlvl.id = techres.techlevel_id
) AS reslevel
ON tech.id = reslevel.tech_id
WHERE tech.cat_id = 10
ORDER BY techlabel;
+-------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-------------------------------------------------------------------------------------------------------------------------|
| Sort (cost=72.10..72.13 rows=14 width=241) |
| Sort Key: tech.label |
| -> Nested Loop Left Join (cost=0.41..71.83 rows=14 width=241) |
| -> Seq Scan on dbi_technologies tech (cost=0.00..10.01 rows=14 width=19) |
| Filter: (cat_id = 10) |
| -> Nested Loop (cost=0.41..4.41 rows=1 width=226) |
| -> Index Only Scan using idx_tech_res_level on dbi_techres techres (cost=0.28..4.02 rows=1 width=8) |
| Index Cond: ((tech_id = tech.id) AND (res_id = 60)) |
| -> Index Scan using dbi_techlevels_id_pkey on dbi_techlevels techlvl (cost=0.13..0.33 rows=1 width=222) |
| Index Cond: (id = techres.techlevel_id) |
+-------------------------------------------------------------------------------------------------------------------------+
We are going to create our first hypothetical index based on the output from the index_advisor statement. Use hypopg_create_index to create it:
SELECT * FROM hypopg_create_index('CREATE INDEX ON dbicc.dbi_technologies USING btree (cat_id)');
Run EXPLAIN on your query to see the impact of the hypothetical index:
postgres# EXPLAIN SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
FROM dbi_technologies tech
LEFT JOIN
( SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
FROM dbi_techres techres,
dbi_techlevels techlvl
WHERE techres.res_id = 60
AND techlvl.id = techres.techlevel_id) AS reslevel
ON tech.id = reslevel.tech_id
WHERE tech.cat_id = 10
ORDER BY techlabel;
+-------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-------------------------------------------------------------------------------------------------------------------------|
| Sort (cost=71.39..71.43 rows=14 width=241) |
| Sort Key: tech.label |
| -> Nested Loop Left Join (cost=4.55..71.12 rows=14 width=241) |
| -> Bitmap Heap Scan on dbi_technologies tech (cost=4.13..9.31 rows=14 width=19) |
| Recheck Cond: (cat_id = 10) |
| -> Bitmap Index Scan on "<14454>btree_dbicc_dbi_technologies_cat_id" (cost=0.00..4.13 rows=14 width=0) |
| Index Cond: (cat_id = 10) |
| -> Nested Loop (cost=0.41..4.41 rows=1 width=226) |
| -> Index Only Scan using idx_tech_res_level on dbi_techres techres (cost=0.28..4.02 rows=1 width=8) |
| Index Cond: ((tech_id = tech.id) AND (res_id = 60)) |
| -> Index Scan using dbi_techlevels_id_pkey on dbi_techlevels techlvl (cost=0.13..0.33 rows=1 width=222) |
| Index Cond: (id = techres.techlevel_id) |
+-------------------------------------------------------------------------------------------------------------------------+
- Bitmap Heap Scan and Bitmap Index Scan: Show how the hypothetical index is used in scanning the table.
- Cost Reduction: Reduced costs in the query plan indicate improved performance due to the hypothetical index.
postgres# SELECT * FROM hypopg_create_index('CREATE INDEX ON dbicc.dbi_techres USING btree (res_id)');
+------------+---------------------------------------+
| indexrelid | indexname |
|------------+---------------------------------------|
| 14455 | <14455>btree_dbicc_dbi_techres_res_id |
+------------+---------------------------------------+
SELECT 1
Time: 0.013s
postgres# EXPLAIN SELECT tech.id techid, tech.label techlabel, reslevel.techlevel techlevel, reslevel.techlvlid
FROM dbi_technologies tech
LEFT JOIN
( SELECT techlvl.label techlevel, techlvl.id techlvlid, techres.tech_id tech_id
FROM dbi_techres techres,
dbi_techlevels techlvl
WHERE techres.res_id = 60
AND techlvl.id = techres.techlevel_id) AS reslevel
ON tech.id = reslevel.tech_id
WHERE tech.cat_id = 10
ORDER BY techlabel;
+---------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|---------------------------------------------------------------------------------------------------------------------------------------|
| Sort (cost=62.47..62.51 rows=14 width=241) |
| Sort Key: tech.label |
| -> Hash Right Join (cost=10.65..62.21 rows=14 width=241) |
| Hash Cond: (techres.tech_id = tech.id) |
| -> Hash Join (cost=1.17..52.55 rows=66 width=226) |
| Hash Cond: (techres.techlevel_id = techlvl.id) |
| -> Index Scan using "<14455>btree_dbicc_dbi_techres_res_id" on dbi_techres techres (cost=0.03..51.12 rows=66 width=8) |
| Index Cond: (res_id = 60) |
| -> Hash (cost=1.06..1.06 rows=6 width=222) |
| -> Seq Scan on dbi_techlevels techlvl (cost=0.00..1.06 rows=6 width=222) |
| -> Hash (cost=9.31..9.31 rows=14 width=19) |
| -> Bitmap Heap Scan on dbi_technologies tech (cost=4.13..9.31 rows=14 width=19) |
| Recheck Cond: (cat_id = 10) |
| -> Bitmap Index Scan on "<14454>btree_dbicc_dbi_technologies_cat_id" (cost=0.00..4.13 rows=14 width=0) |
| Index Cond: (cat_id = 10) |
+---------------------------------------------------------------------------------------------------------------------------------------+
- Index Scan on the hypothetical res_id index: Shows how the second hypothetical index changes the access path on dbi_techres, together with a switch to hash joins.
- Cost Reduction: The further reduced costs in the query plan indicate additional performance gains from the second hypothetical index.
After the creation of our HypoPG indexes, we can see some differences in our explain statements and notice some benefits:
Reduced Total Cost:
- Without Indexes: The total cost was estimated at around 71.83.
- With HypoPG Indexes: The total cost decreased to about 62.21.
Benefit: The overall cost of executing the query is lower, indicating improved efficiency.
Improved Join Strategies:
- Without Indexes: The query used nested loop joins, which can be less efficient for large datasets.
- With HypoPG Indexes: The query plan switched to hash joins, which are generally faster for joining large sets of data.
Benefit: Hash joins are more efficient and can handle larger datasets better than nested loops.
Better Index Usage:
- Without Indexes: The plan didn’t use any specific indexes for cat_id or res_id in the dbi_technologies table.
- With HypoPG Indexes: The plan utilized the hypothetical indexes for both cat_id and res_id.
Benefit: The use of these indexes speeds up data retrieval by directly accessing the relevant rows rather than scanning the entire table.
Change in Access Methods:
- Without Indexes: The plan involved a sequential scan and index-only scans.
- With HypoPG Indexes: The plan used bitmap heap scans and bitmap index scans.
Benefit: Bitmap scans are efficient for handling multiple rows and are typically faster when dealing with indexed columns.
4. Conclusion
Index Advisor and HypoPG are powerful tools in the PostgreSQL ecosystem, allowing DBAs to fine-tune their database performance with minimal disruption (and without overheating your brain). By recommending and simulating indexes, these extensions help you make decisions about index creation, ensuring your queries run efficiently.
Use these tools to analyze your queries, create hypothetical indexes, and measure their impact before making any permanent changes. I hope this blog helps you improve your PostgreSQL performance thanks to Index Advisor and HypoPG.
Happy optimizing!
Useful links
https://github.com/supabase/index_advisor
https://hypopg.readthedocs.io/en/rel1_stable/index.html
https://github.com/HypoPG/hypopg
Also feel free to check my previous blog about pgcli to make your life easier while using PostgreSQL: https://www.dbi-services.com/blog/simplifying-postgresql-management-a-guide-to-install-and-use-pgcli/
The article Enhancing PostgreSQL Performance with Index Advisor and HypoPG first appeared on dbi Blog.
OEM Cloud Control Release update to 13.5.0.23
This is the upgrade procedure for the release update from 07 Aug 2024, 13.5.0.23.
It is also one of the prerequisites for integrating OEM with the OCI Exadata Cloud@Customer solution, which is covered in my other post under this link:
The patching procedure and process is basically the regular one, using the standard OMSPatcher. We are updating from version 13.5.0.16 to 13.5.0.23.
This release update brings some improvements in the following areas:
New Management Packs for Advanced Management and Monitoring of Engineered Systems
To help address the key challenges in managing large distributed systems such as the Oracle Engineered Systems, RU23 introduces the Oracle Enterprise Manager Cloud Control Exadata Management Pack and Zero Data Loss Recovery Appliance (ZDLRA) Management Pack. These packs offer a comprehensive solution to optimize performance of your databases, maximize your Exadata capacity, and automate critical database backup operations.
The key features of the Exadata Management pack include:
- At scale Exadata infrastructure patching using Fleet Maintenance
- CPU allocation analysis with the Database Impact Advisor
- I/O load capacity evaluation with Exadata I/O Resource Management (IORM) Advisor
- Advanced, holistic IOPS allocations visualization for optimal Exascale performance monitoring
The key features of the ZDLRA pack include:
- Automated Database Fleet Backup Management for Recovery Appliance
- Database Fleet Advanced Backup Configuration for ZDLRA Using EM CLI
- Recovery Appliance Protected Database Fleet Archival Backup Scheduling
Source: https://blogs.oracle.com/observability/post/oem13c-ru23-is-now-available
The full list of new features can be found here:
So if you would like any of these new functionalities and better OCI integration with your legacy OEM, please first update the OMS to this version and then follow the post related to OCI integration.
Below is the upgrade procedure.
Have fun!
Step-by-step OMS update Procedure
Download the patch zip files:
In our case we will need patches:
p19999993_135000_Generic.zip omspatcher
p36494040_135000_Generic.zip OMS
p36494050_135000_Generic.zip OEM agent
plus 3 patches needed for JDBC:
1.Patch 35430934
2.Patch 31657681
3.Patch 34153238
p35430934_122140_Generic.zip
p31657681_191000_Generic.zip
p34153238_122140_Generic.zip
We copied patches to work directory oms1:/u02/inst/
In our case, we will need to back up 4 folders and enable a guaranteed restore point on the database:
#1. check OMS
oracle@oms1:/u02/inst/omspatcher/ [em1350] emctl status oms;
Oracle Enterprise Manager Cloud Control 13c Release 5
Copyright (c) 1996, 2021 Oracle Corporation. All rights reserved.
WebTier is Up
Oracle Management Server is Up
JVMD Engine is Up
#2. stop OMS
oracle@oms1:/u02/inst/omspatcher/ [em1350] emctl stop oms -all
Oracle Enterprise Manager Cloud Control 13c Release 5
Copyright (c) 1996, 2021 Oracle Corporation. All rights reserved.
Stopping Oracle Management Server...
WebTier Successfully Stopped
Oracle Management Server Successfully Stopped
AdminServer Successfully Stopped
Oracle Management Server is Down
JVMD Engine is Down
#3. Do backup of OMS + database (just set a guaranteed restore point, don't forget to remove it at the end of the upgrade)
OMS:
mkdir /u02/inst/bkp
cd /u01/app/oracle/
tar cvf - em13c | gzip -9 > /u02/inst/bkp/em13c_bkp.tar.gz
tar cvf - gc_inst | gzip -9 > /u02/inst/bkp/gc_inst_bkp.tar.gz
tar cvf - swlib | gzip -9 > /u02/inst/bkp/swlib_bkp.tar.gz
cd /u01/app/
tar cvf - oraInventory | gzip -9 > /u02/inst/bkp/oraInventory_bkp.tar.gz
#Where folders: em13c - OMS, gc_inst - WebLogic, swlib - software library
#4. Database guaranteed restore point:
SQL> create restore point EM13C_REPO_before_upgrade for pluggable database EM13C_REPO guarantee flashback database;
SQL> col name format a30;
SQL> select name , GUARANTEE_FLASHBACK_DATABASE,con_id from v$restore_point;
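As mentioned in the backup comment above, the guaranteed restore point must be removed once the upgrade has been validated. A possible cleanup statement, mirroring the create statement (run it only when you are sure you will not need to flash back):
SQL> drop restore point EM13C_REPO_before_upgrade for pluggable database EM13C_REPO;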
cd $ORACLE_HOME
mv OMSPatcher OMSPatcher_bkp
cp -rp /u02/inst/omspatcher/OMSPatcher /u01/app/oracle/em13c/
cd /u01/app/oracle/em13c/OMSPatcher/
./omspatcher version
OMSPatcher Version: 13.9.5.21.0
OPlan Version: 12.2.0.1.16
cd /u02/inst/oms/36494040/
emctl start oms;
/u01/app/oracle/em13c/OMSPatcher/omspatcher apply -analyze
OMSPatcher Automation Tool
Copyright (c) 2017, Oracle Corporation. All rights reserved.
OMSPatcher version : 13.9.5.21.0
OUI version : 13.9.4.0.0
Running from : /u01/app/oracle/em13c
Log file location : /u01/app/oracle/em13c/cfgtoollogs/omspatcher/opatch2024-10-01_15-52-13PM_1.log
WARNING:Apply the 12.2.1.4.0 version of the following JDBC Patch(es) on OMS Home before proceeding with patching.
1.MLR patch 35430934(or its superset),which includes bugs 32720458 and 33607709
2.Patch 31657681
3.Patch 34153238
OMSPatcher log file: /u01/app/oracle/em13c/cfgtoollogs/omspatcher/36494040/omspatcher_2024-10-01_15-52-21PM_analyze.log
Please enter OMS weblogic admin server URL(t3s://oms1:7102):>
Please enter OMS weblogic admin server username(oracle):>
Please enter OMS weblogic admin server password:>
Enter DB user name : sysman
Enter 'SYSMAN' password :
Checking if current repository database is a supported version
Current repository database version is supported
Prereq "checkComponents" for patch 36487738 passed.
Prereq "checkComponents" for patch 36329231 passed.
Prereq "checkComponents" for patch 36487989 passed.
Prereq "checkComponents" for patch 36487743 passed.
Prereq "checkComponents" for patch 36487747 passed.
Prereq "checkComponents" for patch 36161799 passed.
Prereq "checkComponents" for patch 36487829 passed.
Configuration Validation: Success
Running apply prerequisite checks for sub-patch(es) "36487738,36487747,36487989,36487829,36487743,36161799,36329231" and Oracle Home "/u01/app/oracle/em13c"...
Sub-patch(es) "36487738,36487747,36487989,36487829,36487743,36161799,36329231" are successfully analyzed for Oracle Home "/u01/app/oracle/em13c"
Complete Summary
================
All log file names referenced below can be accessed from the directory "/u01/app/oracle/em13c/cfgtoollogs/omspatcher/2024-10-01_15-52-13PM_SystemPatch_36494040_1"
Prerequisites analysis summary:
-------------------------------
The following sub-patch(es) are applicable:
Featureset Sub-patches Log file
---------- ----------- --------
oracle.sysman.top.oms 36487738,36487747,36487989,36487829,36487743,36161799,36329231 36487738,36487747,36487989,36487829,36487743,36161799,36329231_opatch2024-10-01_15-52-52PM_1.log
The following sub-patches are incompatible with components installed in the OMS system:
35582217,34430509,34706773,36487761,35854914,36329046,36487844,36329220
--------------------------------------------------------------------------------
The following warnings have occurred during OPatch execution:
1) Apply the 12.2.1.4.0 version of the following JDBC Patch(es) on OMS Home before proceeding with patching.
1.MLR patch 35430934(or its superset),which includes bugs 32720458 and 33607709
2.Patch 31657681
3.Patch 34153238
--------------------------------------------------------------------------------
Log file location: /u01/app/oracle/em13c/cfgtoollogs/omspatcher/36494040/omspatcher_2024-10-01_15-52-21PM_analyze.log
OMSPatcher succeeded.
#In our case we additionally install the 3 JDBC patches listed in the warning:
1.MLR patch 35430934(or its superset),which includes bugs 32720458 and 33607709
2.Patch 31657681
3.Patch 34153238
oracle@oms1:/u02/inst/jdbc_patches/ [em1350] ls -al
total 1908
drwxr-xr-x. 2 oracle oinstall 114 Oct 1 16:20 .
drwxrwxrwx. 7 oracle oinstall 79 Oct 1 16:19 ..
-rw-r--r--. 1 oracle oinstall 468428 Oct 1 14:15 p31657681_191000_Generic.zip
-rw-r--r--. 1 oracle oinstall 58787 Oct 1 14:15 p34153238_122140_Generic.zip
-rw-r--r--. 1 oracle oinstall 1418389 Oct 1 14:15 p35430934_122140_Generic.zip
unzip p35430934_122140_Generic.zip
unzip p31657681_191000_Generic.zip
unzip p34153238_122140_Generic.zip
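# Apply each of the three JDBC patches with OPatch while the OMS is stopped: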
1.
emctl stop oms
cd /u02/inst/jdbc_patches/35430934/
/u01/app/oracle/em13c/OPatch/opatch apply
2.
cd /u02/inst/jdbc_patches/34153238
/u01/app/oracle/em13c/OPatch/opatch apply
3.
cd /u02/inst/jdbc_patches/31657681/
/u01/app/oracle/em13c/OPatch/opatch apply
oracle@oms1:/u02/inst/oms/36494040/ [em1350] /u01/app/oracle/em13c/OMSPatcher/omspatcher apply -analyze
OMSPatcher Automation Tool
Copyright (c) 2017, Oracle Corporation. All rights reserved.
OMSPatcher version : 13.9.5.21.0
OUI version : 13.9.4.0.0
Running from : /u01/app/oracle/em13c
Log file location : /u01/app/oracle/em13c/cfgtoollogs/omspatcher/opatch2024-10-01_16-32-12PM_1.log
OMSPatcher log file: /u01/app/oracle/em13c/cfgtoollogs/omspatcher/36494040/omspatcher_2024-10-01_16-32-21PM_analyze.log
Please enter OMS weblogic admin server URL(t3s://oms1:7102):>
Please enter OMS weblogic admin server username(oracle):>
Please enter OMS weblogic admin server password:>
Enter DB user name : sysman
Enter 'SYSMAN' password :
Checking if current repository database is a supported version
Current repository database version is supported
Prereq "checkComponents" for patch 36487738 passed.
Prereq "checkComponents" for patch 36329231 passed.
Prereq "checkComponents" for patch 36487989 passed.
Prereq "checkComponents" for patch 36487743 passed.
Prereq "checkComponents" for patch 36487747 passed.
Prereq "checkComponents" for patch 36161799 passed.
Prereq "checkComponents" for patch 36487829 passed.
Configuration Validation: Success
Running apply prerequisite checks for sub-patch(es) "36487738,36487747,36487989,36487829,36487743,36161799,36329231" and Oracle Home "/u01/app/oracle/em13c"...
Sub-patch(es) "36487738,36487747,36487989,36487829,36487743,36161799,36329231" are successfully analyzed for Oracle Home "/u01/app/oracle/em13c"
Complete Summary
================
All log file names referenced below can be accessed from the directory "/u01/app/oracle/em13c/cfgtoollogs/omspatcher/2024-10-01_16-32-12PM_SystemPatch_36494040_1"
Prerequisites analysis summary:
-------------------------------
The following sub-patch(es) are applicable:
Featureset Sub-patches Log file
---------- ----------- --------
oracle.sysman.top.oms 36487738,36487747,36487989,36487829,36487743,36161799,36329231 36487738,36487747,36487989,36487829,36487743,36161799,36329231_opatch2024-10-01_16-32-43PM_1.log
The following sub-patches are incompatible with components installed in the OMS system:
35582217,34430509,34706773,36487761,35854914,36329046,36487844,36329220
Log file location: /u01/app/oracle/em13c/cfgtoollogs/omspatcher/36494040/omspatcher_2024-10-01_16-32-21PM_analyze.log
OMSPatcher succeeded
Now the output contains no warnings, so it is safe to proceed with the patching.
Update OMS
emctl stop oms;
cd /u02/inst/oms/36494040/
/u01/app/oracle/em13c/OMSPatcher/omspatcher apply
oracle@oms1:/u02/inst/oms/36494040/ [em1350] /u01/app/oracle/em13c/OMSPatcher/omspatcher apply
OMSPatcher Automation Tool
Copyright (c) 2017, Oracle Corporation. All rights reserved.
OMSPatcher version : 13.9.5.21.0
OUI version : 13.9.4.0.0
Running from : /u01/app/oracle/em13c
Log file location : /u01/app/oracle/em13c/cfgtoollogs/omspatcher/opatch2024-10-01_16-48-53PM_1.log
OMSPatcher log file: /u01/app/oracle/em13c/cfgtoollogs/omspatcher/36494040/omspatcher_2024-10-01_16-49-02PM_apply.log
Please enter OMS weblogic admin server URL(t3s://oms1:7102):>
Please enter OMS weblogic admin server username(oracle):>
Please enter OMS weblogic admin server password:>
Enter DB user name : sys
Enter 'sys' password :
Checking if current repository database is a supported version
Current repository database version is supported
Prereq "checkComponents" for patch 36487738 passed.
Prereq "checkComponents" for patch 36329231 passed.
Prereq "checkComponents" for patch 36487989 passed.
Prereq "checkComponents" for patch 36487743 passed.
Prereq "checkComponents" for patch 36487747 passed.
Prereq "checkComponents" for patch 36161799 passed.
Prereq "checkComponents" for patch 36487829 passed.
Configuration Validation: Success
Running apply prerequisite checks for sub-patch(es) "36487738,36487747,36487989,36487829,36487743,36161799,36329231" and Oracle Home "/u01/app/oracle/em13c"...
Sub-patch(es) "36487738,36487747,36487989,36487829,36487743,36161799,36329231" are successfully analyzed for Oracle Home "/u01/app/oracle/em13c"
To continue, OMSPatcher will do the following:
[Patch and deploy artifacts] : Apply sub-patch(es) [ 36161799 36329231 36487738 36487743 36487747 36487829 36487989 ]
Apply RCU artifact with patch "/u01/app/oracle/em13c/.omspatcher_storage/36487738_Aug_3_2024_00_57_51/original_patch";
Apply RCU artifact with patch "/u01/app/oracle/em13c/.omspatcher_storage/36329231_May_8_2024_02_03_34/original_patch";
Apply RCU artifact with patch "/u01/app/oracle/em13c/.omspatcher_storage/36487989_Aug_3_2024_02_10_51/original_patch";
...
...
...
emctl_register_gccompliance_2024-10-01_17-34-40PM.log
MRS-SecurityClassManager emctl_register_SecurityClassManager_2024-10-01_17-43-22PM.log
MRS-OutOfBoxRole emctl_register_OutOfBoxRole_2024-10-01_17-43-27PM.log
MRS-report emctl_register_report_2024-10-01_17-43-32PM.log
MRS-namedsql emctl_register_namedsql_2024-10-01_17-43-36PM.log
MRS-runbooks emctl_register_runbooks_2024-10-01_17-43-41PM.log
MRS-derivedAssocs emctl_register_derivedAssocs_2024-10-01_17-43-45PM.log
MRS-derivedAssocs emctl_register_derivedAssocs_2024-10-01_17-43-52PM.log
MRS-derivedAssocs emctl_register_derivedAssocs_2024-10-01_17-43-56PM.log
Log file location: /u01/app/oracle/em13c/cfgtoollogs/omspatcher/36494040/omspatcher_2024-10-01_16-49-02PM_apply.log
OMSPatcher succeeded
#Execute Sync EM:
emctl sync;
#Restart the OMS (just to check that everything is OK):
emctl stop oms -all
emctl start oms
Oracle Enterprise Manager Cloud Control 13c Release 5
Copyright (c) 1996, 2021 Oracle Corporation. All rights reserved.
Starting Oracle Management Server...
WebTier Successfully Started
Oracle Management Server Successfully Started
Oracle Management Server is Up
JVMD Engine is Up
Log back in to the EM console, click 'About Enterprise Manager' in the user drop-down menu and check the version.
Go to Manage Cloud Control > Health Overview and check the overall OMS status.
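As an additional check from the command line (a minimal sketch, assuming the same OMS home as above; the exact output depends on your environment), you can display the OMS status details and list the patches registered in the OMS home:
emctl status oms -details
/u01/app/oracle/em13c/OMSPatcher/omspatcher lspatches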
If all the above operations were successful, you can now remove the database restore point.
Removing the restore point
#Remove the guaranteed restore point on the OMS repository database:
SQL> drop restore point EM13C_REPO_before_upgrade for pluggable database EM13C_REPO;
SQL> col name format a30;
SQL> select name , GUARANTEE_FLASHBACK_DATABASE,con_id from v$restore_point;
Now we can update the OEM Agents.
There are many options to update/patch OEM Agents; please refer to the Oracle documentation to find the one most suitable for your environment. In this tutorial we will use the automated 'Provisioning and Patching' feature of OEM.
Go to Enterprise > Provisioning and Patching > Saved Patches and upload the OEM agent patch p36494050_135000_Generic.zip.
After the patch has been uploaded successfully, click the patch number and create a Deployment Plan.
Name the plan and choose the targets to update:
Once the plan is created, go to Enterprise > Provisioning and Patching > Patches & Updates and choose your plan.
Then analyze the plan and deploy it; you will be able to follow the details in the Jobs view.
After the deployment finishes, the agents should be updated and restarted on all nodes you chose during plan creation.
If you have many agents, some of them may run into issues during the upgrade. Usually a manual agent restart helps; if not, each case needs to be investigated separately. In the worst case, you can also simply redeploy the agent on the affected host with a new agent image.
That’s it!
We updated the OMS and the OEM Agents to 13.5.0.23.
If you are interested in using the new Exadata Cloud@Customer integration features of OEM, the link below points to my post on how to do this:
The article OEM Cloud Control Release update to 13.5.0.23 appeared first on the dbi Blog.
M-Files to unify your data
Two weeks ago, I was at #SITF24 to talk about the integration of external repositories (SharePoint/Teams, shared folders, …) into M-Files. Surprisingly, this is still a tricky topic for companies.
Having worked as an ECM consultant for years, I'm probably a bit too naive and assume documents are already well managed in most firms. The feedback I received proved that the reality is different.
What is the need?
Based on a study conducted by Gartner in 2023, people who work with data (producing documents and so on) have to deal with an average of 11 applications, and this number increases for people working in an IT department.
This situation automatically leads to problems like:
- Confusion: where should I store my document, and where is the previous version?
- Duplicates: a document is recreated because it was not stored at the right place.
- Compliance: a new version of a document is created, but not from the latest version, because the collaborator was not able to identify the current version of the document.
- Security: a document is not stored at the right place, so the right permissions are not applied.
Luckily, these cases do not usually all apply at the same time, but they are common issues companies are facing.
How can M-Files help?
M-Files has several interesting features to help users.
The idea is to make M-Files the entry point for the user: as soon as someone in the company needs some data, the reflex is to go to M-Files.
Prior to that, the administrator has configured some external repositories to connect other applications: network shares, SharePoint sites/document libraries, MS Teams channels, but also Google Workspace, Amazon S3…
You may also want to connect other applications to gather data that is not document-based, like Salesforce, SAP, Microsoft Dynamics CRM… or even a simple database.
You can find all the existing connectors here. If you aren't able to find what you need, you can create your own.
Once the connections are made, M-Files starts indexing the data. After some time (depending on the amount of data to index, of course), users will be able to search in M-Files and find data stored outside.
That's already good, but M-Files can do more.
Intelligent Metadata Layer
In previous posts (like here) I already talked about IML, the AI capabilities of M-Files.
This time I will introduce another module, M-Files Discovery.
With this module, the M-Files magic appears!
M-Files Discovery can automatically classify and categorize documents stored on external repositories.
What does this mean? Simply that no matter where the documents are stored, if M-Files is able to classify them, they will appear in the M-Files views.
Then if a workflow is defined for this class of document, it can follow the different steps exactly like a native M-Files document.
Discovery can also detect Personally Identifiable Information; it can be really interesting to define specific scenarios for that case and ensure compliance.
Additionally, with the M-Files duplicate detection you can easily identify a document stored in multiple repositories (cf. the previous screenshot, where “duck-invoice.pdf” is stored in 3 different repositories).
As usual, if you have any further questions, feel free to reach out to us, and don't hesitate to tell me if you want more technical information; I will be happy to write another blog on that topic.
The article M-Files to unify your data appeared first on the dbi Blog.
Developing a Business Central Custom API for Power BI Integration – Resolving the ‘Could Not Download Symbols’ Error
Dear Readers,
This article marks my first blog post. I am both excited and eager to share my experience of tackling a challenging issue encountered while developing a custom API for Microsoft Business Central to enable data extraction for Power BI reports. I hope this article will be informative and help others avoid the pitfalls I faced.
Context
As part of the Business Central online implementation, I was tasked with enhancing the reporting solution by integrating additional Power BI reports. The process involved the following steps:
- Extract data from Business Central.
- Build Power BI reports.
- Publish reports to the Power BI Service.
Being new to Business Central, my first task was to explore how to connect it with Power BI. After some research, I found that there are two primary methods to expose data from Business Central for use in Power BI reports:
- API Pages or Queries: A method for exposing data through a structured interface.
- Open Data Protocol (OData) Web Services: Allows data to be accessed via a standardized protocol.
Microsoft strongly recommends using REST API web services for integration with Power BI, as highlighted in the following articles:
- Choosing Whether to Use API Pages or OData Web Services
- Compare REST APIs, SOAP, and OData Web Services
Based on this, I decided to use API Pages, as they provide a consistent and reliable interface for integrating with external applications.
To meet specific business needs, I had to implement custom APIs: although the standard APIs cover common processes and entities such as customers, vendors, and sales orders, they often lack support for custom fields or more complex business logic. Custom APIs, in contrast, provide a flexible integration solution, allowing developers to build APIs tailored to unique workflows and specialized needs.
Problem Statement
My task was to develop a custom API to extract Business Central data into Power BI. I quickly set up my development environment using Visual Studio Code and the AL Language extension for Microsoft Dynamics 365 Business Central. The necessary downloads can be accessed via the following links:
- Download Visual Studio Code.
- Download the AL Language extension for Microsoft Dynamics 365 Business Central.
Confident that everything was ready, I began developing the API through the following steps:
Phase 1: Sign up to Business Central
Phase 2: Create a new AL project in VS Code with target platform 13.0
Project contents
File launch.json
File app.json
Phase 3: Download symbols
Downloading symbols is crucial for accessing existing functionality, ensuring proper compilation, and using development tools effectively in Business Central extension development.
I executed the command to download the symbols via Ctrl + Shift + P then AL: Download symbols
Then the following error messages occurred:
Despite various attempts to resolve the problem, such as installing the Azure Account extension, clearing the Visual Studio Code cache, and creating new projects with different target platform versions, the issue persisted.
Root Cause Analysis
Upon investigating the root cause, I discovered that the problem lay in the configuration settings of the development environment. Specifically, the environmentName parameter in the launch.json file was incorrectly configured.
The development environment failed to properly connect to Business Central, which prevented the download of the necessary symbols. This issue can be attributed to incorrect URLs or missing settings that blocked the symbol download process.
Solution
The download symbols process relies on the current configuration in the launch.json file. Therefore, I modified the environmentName parameter in launch.json to reference the correct Business Central instance, as demonstrated below:
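Since the original screenshot is not reproduced here, below is a minimal sketch of what the relevant configuration in launch.json looks like for a cloud sandbox; the environmentName value ("MySandbox") and the startup object are placeholders that must be replaced with the values of your own Business Central environment:
{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "al",
            "request": "launch",
            "name": "Microsoft cloud sandbox",
            "environmentType": "Sandbox",
            "environmentName": "MySandbox",
            "startupObjectId": 22,
            "startupObjectType": "Page",
            "breakOnError": true
        }
    ]
}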
These details can be obtained from the Business Central session that I initiated during the first phase of the development process. After making that update, I successfully downloaded the symbols, which resolved the error and displayed the following confirmation message:
This allowed me to proceed with the development of the custom API.
ConclusionResolving the “Could not download symbols” error in Business Central requires careful configuration of the development environment and proper permission settings. By following the steps outlined above, you can avoid this issue and ensure smooth integration between Business Central and Power BI.
Stay tuned for more insights in my upcoming posts, where I will delve into implementing a custom calendar feature that integrates data from Business Central and enhances logic within Power BI.
The article Developing a Business Central Custom API for Power BI Integration – Resolving the ‘Could Not Download Symbols’ Error appeared first on the dbi Blog.
SQL Konferenz 2024 – Microsoft Copilots
Today, together with Roger Schönmann, I followed the first day of SQL Konferenz 2024. We are attending this event from October 01 to 02 in Hanau.
At dbi services, we are no strangers to this conference, as we have been sending members of our team for several years now.
We consider the sessions to be of high quality and right on trend, and this year is no exception.
If you would like to know the details of the sessions given on this first day, you can find them in this Agenda.
For my part, here is a list of the sessions I attended today:
- Taming the Datamonster with intelligent AI using SQL and Microsoft
- Here’s the execution plan … now what?
- Thwarting Disaster for SQL Server on Azure VM
- Query Store: The simplest way to Troubleshooting Performance
- ACID in SQL Server: what it is, how it works, and how to live more adventurously
- Welcome to the age of SQL Copilots
There were also some more Data Engineer-oriented sessions.
In this blog, I have chosen to focus on Bob Ward’s session on Copilot. Indeed, I feel that this is an attractive topic to focus on.
Copilots are Microsoft's AI-driven assistants designed to support and assist you in various IT tasks.
You will be able to meet them in a variety of environments and contexts:
- Microsoft Copilot (website and standalone application)
- Microsoft Copilot for Azure (embedded in Azure – In Preview)
- Microsoft Copilot for Microsoft 365 (integrated with all apps and documents)
- Microsoft Copilot for Security (a generative AI-powered security assistant)
- Copilot in Windows (integrated directly into the Windows OS)
- Copilot in Fabric (In Preview)
- GitHub Copilot (code development)
- Copilot in Azure SQL Database (assistance and T-SQL code generation – In Preview)
As you can see, Copilot AI is flooding into Microsoft products at every level. In time, these tools will become part of our daily work.
Among this variety of Copilot offerings, Bob Ward focused his presentation on Microsoft Copilot for Azure and Copilot for Azure SQL Database.
According to Microsoft’s vision, these Copilots aim to:
- Improve efficiency in several areas: performance, security, availability…
- Design and optimize: schemas, T-SQL code…
- Simplify: database management, reduce complexity of SQL…
Before going further into this session, it is important to remember that we interact with AI through prompts. These are instructions or questions that guide Copilot to produce responses or perform specific tasks. Bob Ward shared a non-exhaustive list of prompts to ask in the context of a performance problem:
Later on, to explain Copilot’s potential, several demos were presented for different use cases:
- A current performance issue is occurring at database level
- Data must be transformed in a specific manner
- Retrieve the top 20 consuming queries in the past 4 days
- …
In the demo below, “my database is slow”, Copilot was proactively able to determine that the high CPU utilization was a relevant piece of information to share.
In addition, it returned all the CPU metrics and the query responsible for this high usage. Of course, the performance problem was quite obvious in this scenario; usually, a monitoring tool has already informed you of excessive CPU usage and which query is to blame.
In fact, the added value is that you get it interactively and with a detailed report.
What would be interesting is Copilot’s behavior in a more complex case, where the performance problem is not obvious.
In addition, here are some considerations when using Copilot for SQL Database in Azure:
- Use the database blade context when using prompts to improve efficiency
- Be aware of the tasks you are asking for: some may take much longer than others
- Copilot only has access to the resources we have access to; it cannot perform analysis or actions if we do not have the necessary permissions
AI-driven assistants have undeniable potential and much to contribute to our day-to-day work. We are clearly in the early stages of using them, and perspectives are exciting.
There is great potential for efficiency gains and simplification in many IT projects, but we must not fall into the trap of trying to apply them at any cost to something that is already effective.
The article SQL Konferenz 2024 – Microsoft Copilots appeared first on the dbi Blog.
Future Data Driven 2024 – introduction to vector databases
I had the opportunity to participate in the Future Data Driven 2024 event (https://datadrivencommunity.com/FutureDataDriven2024.html) with my colleague Stéphane Haby.
Many speakers were present, and I was able to attend several sessions covering the following topics: Azure, AI, CosmosDB, Co-Pilot, Database Watcher. In several sessions, when AI was discussed, we talked about vector databases.
These were terms I had heard before but never really understood. So, I would like to present how vector databases are used.
Before talking about vector databases, it is necessary to cover certain concepts, which are as follows:
- Vectors
- Vector embeddings
- Cosine similarity
General Information
Vectors are mathematical objects that consist of both a magnitude (or length) and a direction. We can graphically represent a vector (in a 2-dimensional plane) as follows:
A 2-dimensional vector can be written as follows:
In our case, we can write it as:
We also find another notation to describe vectors, which is as follows:
Vectors are multidimensional, and their notation can be as follows for a 4-dimensional vector:
A first problem arises here. Indeed, when multiple vectors are described, this initial form of notation can be confusing. In this case, we can introduce a new notation:
x and y are the components of vector a.
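The notations themselves appear as images in the original post; as a sketch of the standard forms being described (the concrete values from the figure are not reproduced here), a 2-dimensional vector a can be written in column form or in component form, and the component form extends naturally to higher dimensions (here 4):
\[ \vec{a} = \begin{pmatrix} x \\ y \end{pmatrix} = (x, y), \qquad \vec{b} = (b_1, b_2, b_3, b_4) \]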
Length or Magnitude of a Vector:
We mentioned earlier that one of the properties of a vector is its length or norm. This is calculated as follows for a vector:
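The formula is shown as an image in the original post; for a generic 2-dimensional vector a = (x, y) it reads:
\[ \|\vec{a}\| = \sqrt{x^2 + y^2} \]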
In the case of the vector described above, this results in:
Thus, we can see that this formula corresponds to the calculation of the hypotenuse. The length of the vector is the value of the hypotenuse.
Vector embeddings
Vector embeddings allow us to represent data (texts, images, etc.) in the form of vectors. One way to describe images using a relational database system would be to use several tables:
- Image: contains the images
- Tag: contains the words that describe the images
- Description: associates an image with the words (the tags) that describe it
By using certain SQL operators, it should be possible to find images that are described as similar (because they share the same tags).
Vector embeddings represent data differently. Several techniques exist for this, including:
- Word2Vec: A method that uses neural networks to create vector representations of words.
- GloVe (Global Vectors for Word Representation): A technique that uses co-occurrence statistics to create embeddings.
- BERT (Bidirectional Encoder Representations from Transformers): An advanced model that generates contextual embeddings, taking into account surrounding words.
Thus, vector databases use vector embeddings. Once these embeddings are generated, they are stored in vector databases. Then, it is possible to use cosine similarity to evaluate how similar the vectors are. Finally, vector databases index the vectors, and there are several methods:
- Tree-based indexing
- Approximate Nearest Neighbors
- Quantization
Cosine similarity allows for a precise evaluation of the similarity between two vectors. Here is an example:
- Imagine that I have two recipe cards. I would like to determine if these two recipes are similar based on the ingredients mentioned in them.
- So, I have two recipes: R1 and R2
- R1: in this recipe, the word “egg” is mentioned 6 times and the word “flour” is mentioned 1 time.
- R2: in this recipe, the word “egg” is mentioned 1 time and the word “flour” is mentioned 8 times.
Based on this information, it is possible to graphically represent how many times these words appear.
The first vector represents my first recipe. The second vector represents my second recipe.
Cosine similarity will determine whether these two recipes are similar (this is a very simplified example). To do this, we calculate the cosine of the angle formed between the two vectors on the graph; with the word counts above, this angle is roughly 73 degrees, and its cosine is about 0.29.
Here are more details on how to calculate cosine similarity:
It is possible to represent recipes R1 and R2 by two vectors, which are as follows:
So we have:
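The detailed formulas are shown as images in the original post; a sketch of the computation with the word counts given above, R1 = (6, 1) and R2 = (1, 8), is:
\[ \cos\theta = \frac{\vec{r_1} \cdot \vec{r_2}}{\|\vec{r_1}\| \, \|\vec{r_2}\|} = \frac{6 \cdot 1 + 1 \cdot 8}{\sqrt{6^2 + 1^2} \, \sqrt{1^2 + 8^2}} = \frac{14}{\sqrt{37} \, \sqrt{65}} \approx 0.29 \]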
The values of cosine similarity can range between -1 and 1. Here, this value indicates that the two recipes are not very similar.
Pinecone, vector databases, Python implementation
Pinecone is a vector database. It is designed to store, manage, and search vectors, enabling similarity search operations and analyses based on unstructured data.
Here is an example, adapted from the documentation, that illustrates how to use a vector database and calculate similarities between vectors:
Code :
from pinecone.grpc import PineconeGRPC as Pinecone
from pinecone import ServerlessSpec
import time  # needed for the readiness wait loop below

pc = Pinecone(api_key='MyAPIKey')

index_name = "example-index"

# Create a 2-dimensional index that uses cosine similarity as its distance metric
pc.create_index(
    name=index_name,
    dimension=2,
    metric="cosine",  # we use the cosine similarity
    spec=ServerlessSpec(
        cloud='aws',
        region='us-east-1'
    )
)

# Wait until the index is ready before using it
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)

index = pc.Index(index_name)

# Insert the two recipe vectors
index.upsert(
    vectors=[
        {"id": "vec1", "values": [6, 1]},  # we define our first vector (R1)
        {"id": "vec2", "values": [1, 8]},  # we define our second vector (R2)
    ],
    namespace="example-namespace1"
)

# Query the index with a new vector and return the 3 closest matches
query_results1 = index.query(
    namespace="example-namespace1",
    vector=[1, 7],
    top_k=3,
    include_values=True
)
print(query_results1)
Here is the result:
We had as input a vector with the value of [1, 7]. This vector was compared to the other two vectors, and the cosine similarity (the score) was determined.
If you encounter any errors while running this script, it is possible that your index already exists or has not yet been created.
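As a final sanity check of the scores above, here is a small, self-contained sketch (no Pinecone account required) that recomputes the cosine similarity locally; it gives a score close to 1 for vec2 = [1, 8] and a much lower one for vec1 = [6, 1]:
import math

def cosine_similarity(a, b):
    # dot product divided by the product of the vector lengths
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [1, 7]
print(cosine_similarity(query, [6, 1]))  # ~0.30, not very similar
print(cosine_similarity(query, [1, 8]))  # ~1.00, almost identical direction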
The article Future Data Driven 2024 – introduction to vector databases appeared first on the dbi Blog.