Split Large Text/CSV File into Multiple Files in PL SQL

Sometimes it happens that you have a very large text or CSV file to process, but first you want to make smaller files of that large file.  Because that large file may take too much time to process or open. So I am giving an example below to split large text/CSV file into multiple files in PL SQL using stored procedure.

You just need to pass two parameters to this PL SQL procedure, first is database directory object name, where the text files are residing and the second is the source file name (the file which you want to split).

If Oracle directory object is not exists for the location of text files, then you can create it as shown below:

For windows:
CREATE OR REPLACE DIRECTORY CSV_FILE_DIR AS 'D:\plsql\text_files';
For Linux/Unix (due to difference in path):
CREATE OR REPLACE DIRECTORY CSV_FILE_DIR AS '/plsql/text_files';

Change the path above according to yours files location. Then create the below procedure by executing its script:

CREATE OR REPLACE PROCEDURE split_file (p_db_dir IN VARCHAR2,
 p_file_name IN VARCHAR2)
IS
 read_file UTL_FILE.file_type;
 write_file UTL_FILE.file_type;
 v_string VARCHAR2 (32767);
 j NUMBER := 1;
BEGIN
 read_file := UTL_FILE.fopen (p_db_dir, p_file_name, 'r');

WHILE j > 0
 LOOP
 write_file := UTL_FILE.fopen (p_db_dir, j || '_' || p_file_name, 'w');

FOR i IN 1 .. 100
 LOOP -- example to dividing into 100 rows for each file.. you can increase the number as per your requirement
 UTL_FILE.get_line (read_file, v_string);
 UTL_FILE.put_line (write_file, v_string);
 END LOOP;

UTL_FILE.fclose (write_file);
 j := J + 1;
 END LOOP;
EXCEPTION
 WHEN OTHERS
 THEN
 -- this will handle if reading source file contents finish
 UTL_FILE.fclose (read_file);
 UTL_FILE.fclose (write_file);
END;

This procedure splitting 100 rows for each file, which you can modify as per your need. Now execute this procedure as shown below by passing database directory object name and the file name:

BEGIN
 split_file ('CSV_FILE_DIR', 'text_file.csv');
END;

You can check your file location (CSV_FILE_DIR) for the multiple files starting with numbers like 1_text_file.csv, 2_text_file.csv and so on, as shown in below image:

split large text/csv file into multiple files in pl sql

Vinish Kapoor
Vinish Kapoor

Vinish Kapoor is a seasoned software development professional and a fervent enthusiast of artificial intelligence (AI). His impressive career spans over 20 years, marked by a relentless pursuit of innovation and excellence in the field of information technology. As an Oracle ACE, Vinish has distinguished himself as a leading expert in Oracle technologies, a title awarded to individuals who have demonstrated their deep commitment, leadership, and expertise in the Oracle community.

8 Comments

  1. Hi,

    I tried above option. What I realized is the main file e.g. test_file contains 100 Records and the sum of split files is not equal to 100. i.e. sum of records from 1_test_file and 2_test_file is not equal to records in test_file.

    • In how many rows you divided? For example, if you will divide in 40 rows, then the first file will contain 40 records, 2nd 40 and the 3rd will contain 20.

    • I have modified split function to split files in the interval of 3000.
      Total records were 9886.

      3000 1st split
      3000 2nd split
      3000 3rd split
       674 4th split
      ====
      9674

      9886 - 9674 = 212 difference.
      i.e. I am getting short of 212 records when I did the additions of 4 split files.

    • It seems somewhere is error is happening that is why it is not generating the last file completely.

      Try to remove the exception section of the above code temporarily.

      Then run the program again, you will get the error if something is wrong.

    • Sure, I will try that option. In fact I checked the data which is not getting picked up in the last split file. But to my surprise I did not see anything suspicious that could prevent it from picking up in the last split file.

    • Hi,

      I would like to confirm that now I am able to get the exact count as expected. i.e. Record Count from the Main file and the sum of record counts from various split files are exactly matching.

      It was my bad. I was calling split_file function immediately after "END LOOP".
      It should have been called after the "UTL_FILE.fclose (v_filehnd); -- CLOSE FILE" as shown below.

        END LOOP;
        UTL_FILE.fclose (v_filehnd); -- CLOSE FILE
        split_file ('FTP_OUTBOX', '*******.csv');  

      No need of removing exception from the split_file function.
      Thanks a lot for your kind support and patience.

Comments are closed.