Query CSV files using command line SQL-like csvq tool

Target audience: c#, SQL Coders
Used tools: c#, SQL
What’s the purpose: Read, update, delete CSV records with SQL-like query tool

his will be a story about how I discover CSVG – SQL-like query tool for csv files. I have to resolve the following problem : my colleague make a monthly report for parcels. She generate this report in CSV file from external application – we were only end users, without possibility to change this app. It has a problem with Cyrillic text – it was coded in UTF-8 BOM. This report contains parcel ID, name of receiver, his name and other data.

Something like this : billDet.CSV

List of parcels;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Company: NAME;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Start date: 01.07.2019;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
End date: 31.07.2019;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Info:;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
ID;Sending date;Sender;Receiver;Country Receiver;;;;;;;;;;;;Price
EE009999994NN;01.07.2019 14:26;SOME;ENGLAND;GARETH JAY;;;;;;;;;;;;;53,00
EE009999963NN;02.07.2019 16:29;SOME;GERMANY;HENDRIK MOON;;;;;;;;;;;;;60,00
EE009999901NN;05.07.2019 14:35;SOME;SPAIN;PALMERO DIAS;;;;;;;;;;;;;56,40
...

In this file we can mark 2 features :

  1. First 6 rows only garbage
  2. And decimal symbol is comma, not decimal point – for example 53,00 and must become 53.00

Now, we have to done the following:

  1. clear first 6 rows from csv file, before make a query
  2. and replace comma with decimal point

Well, these 2 problems can be resolved with BAT file or with VBA macro in Excel. But, for Excel method there must have an installed office on the user PC – so, this method drop out. But then arise the next problem – how to make SQL query in BAT file ?
And – here is the place for tool that I found for parsing CSV file, using SQL queries CSVQ.exe. Of course, there are other such tools. But this tool was power enough to fulfill by job – to make SQL query from the command prompt without any additional installation – I have to copy only one little exe – csvq.exe with 10M size.

Now, I can make a BAT file for resolving the task or even make a C# console app. I choose the second option – C# console app . But before this, there are some preliminary work to do .
My colleague wants this report to be group by regions to which these countries belongs.The problem was that that this app have not such report and even have not such data field as region of the country. Also this app was not made by our firm, so this new report have to be made outside of the app. So, what we have to done:

  1. To make additional file with name of country and number of region, to which this country belongs
  2. To make query from app’s generating report and additional file for country’s region from above in 1

Now, we make a c.CSV file with such content :

1;AFGHANISTAN;3
2;AUSTRALIA;3
3;AUSTRIA;2
4;AZERBAIJAN;2
...

Columns in this CSV file are : Serial number,Name of the country and Number of the zone to which this country belongs.

And now – the last – C# code for app :

static void Main(string[] args)
  {           
    try
        {
          System.Diagnostics.Process.Start("CMD.exe", "/C more +6 ee.csv >e9.csv");

          string strSQL1 = "\"SELECT c.c3 AS Zona,c.c2,e.c34,e.c69 AS Sbor_lv from e inner join c on e.c32=c.c2 order by c.c3,c.c2\"";
          string strSQL2 = "\"SELECT c.c3 AS Zona,c.c2 AS State,number_format(round(SUM(e.c69),2),2,'.',' ') AS Sbor_lv,count(*) AS Broj from e inner join c on e.c32=c.c2 GROUP BY c.c3,c.c2 ORDER BY c.c3,c.c2\"";
          string strSQL3 = "\"SELECT string('---------------------------------------------------') AS O,c.c3 AS Zona,number_format(round(SUM(e.c69),2),2,'.',' ') AS Sbor_lv,COUNT(c.c3) AS Broj from e inner join c on e.c32=c.c2 GROUP BY c.c3\"";
          string strSQL4 = "\"SELECT string('----------------------------------------------------------') AS OBSTA_SUMA,number_format(round(SUM(e.c69),2),2,'.',' ') AS Sbor_lv,count(*) AS Broj from e inner join c on e.c32=c.c2\"";

          string strCmdText1 = "/C csvq.exe -n -d \";\" " + strSQL1 + " >T1"; // & TYPE T1";
          string strCmdText2 = "/C csvq.exe -n -d \";\" " + strSQL2 + " >T2"; // & TYPE T2";
          string strCmdText3 = "/C csvq.exe -n -d \";\" " + strSQL3 + " >T3"; // & TYPE T3";
          string strCmdText4 = "/C csvq.exe -n -d \";\" " + strSQL4 + " >T4";  //& TYPE T4";
          System.Diagnostics.Process.Start("CMD.exe", strCmdText1);
          System.Diagnostics.Process.Start("CMD.exe", strCmdText2);
          System.Diagnostics.Process.Start("CMD.exe", strCmdText3);
          System.Diagnostics.Process.Start("CMD.exe", strCmdText4);

          System.Threading.Thread.Sleep(2000); //2 sec . 1 sec is not enouth !!!
          System.Diagnostics.Process.Start("CMD.exe", "/C COPY T1+T2+T3+T4 TA.txt");
          System.Diagnostics.Process.Start("CMD.exe", "/C notepad.exe TA.txt");
        }
    catch (Exception ex)
       {
          Console.WriteLine(ex.ToString());
       }
  }

IMPORTANT : I sincerely recommend to check this utility CSVQ.exe – it’s really great – it’s little, handy and works !

https://mithrandie.github.io/csvq/

The main problem is documentation – it has a doc, but there must be more examples of using it ! You have to be too stubborn find some useful examples in internet !

Published
Categorized as SQL