Delphi Pages Forums  

  #1  
10-05-2017, 01:40 PM
Luiz Eduardo
Member
 
Join Date: May 2014
Posts: 34
How to remove the complete path and http/https prefix from a URL?

How can I remove the complete path and the http/https prefix from each URL in a text file?

I have several URLs (with complete paths) in a text file. I want to strip the path and the http/https prefix from each URL and then save everything back to the same file.

For example:

Code:
https://github.com/markjaquith/WordPress/blob/master/
Then I need each line in the file to look like this:

Code:
github.com
Any idea?
  #2  
10-05-2017, 02:37 PM
rojam
Senior Member
 
Join Date: Jun 2015
Posts: 173

I haven't tested this code, but the basic premise is to use a regular expression to match the domain and to add it to another StringList, returning the StringList with the matched values. Make sure you FREE the returned StringList from where you are calling this function.

Code:
uses System.RegularExpressions, ...

....

TForm1 = class(TForm)
  ..
private
  function URLtoDomain(const FileName: String): TStringList;
end;

function TForm1.URLtoDomain(const FileName: String): TStringList;
var
  Regex: TRegex;
  S: String;
  Match: TMatch;
  Pattern: String;
  FileSL: TStringList;
begin
  FileSL := TStringList.Create;
  try
    FileSL.LoadFromFile(FileName);
    Pattern := '(([a-z0-9-%])+\.)*((a(ero|rpa))|(biz)|(cat|(co(m|op)))|(edu)|(gov)|(in(fo|t))|(jobs)|(m(il|obi|useum))|(name|net)|(om|org)|(pro)|(qa)|(ru)|(travel))';
    Result := TStringList.Create;
    Regex := TRegex.Create(Pattern, [roIgnoreCase]);
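    for S in FileSL do
    begin
      // Keep only the first match per line; later matches would come from
      // TLD-like fragments in the path (e.g. "com" or "net" inside a word).
      Match := Regex.Match(S);
      if Match.Success then
        Result.Add(Match.Value);
    end;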
  finally
    FileSL.Free;
  end;
end;
Usage:
Code:
var
  FileName: String; // "File" is a reserved word in Delphi, so it cannot be a variable name
  SL: TStringList;
begin
  FileName := 'C:\Path\to\my\file.txt';
  SL := URLtoDomain(FileName); // this function creates SL, so be sure to Free it
  try
    //SL now contains your new values
    //do what you will with them
    SL.SaveToFile('c:\Path\to\New\File.txt');
  finally
    SL.Free;
  end;
end;
  #3  
10-05-2017, 02:57 PM
Norrit
Moderator
 
Join Date: Aug 2001
Location: Landgraaf
Posts: 7,222

I would search for "://" and copy from there up to the next /.
Just a simple Pos and PosEx with a Copy.

The only thing you need to do first is a StringReplace of \ to /,
because a URL written with \ is also valid.

RegEx works too, but since I live in the Netherlands I notice that .nl is not in the pattern and therefore would not be found. And I can think of more countries that are missing.
Hence the simple string-slice method, since you're only interested in the first entry after the "://".
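A minimal sketch of the string-slice idea, untested against real data. The function name ExtractHost and the demo program around it are made up for illustration; it assumes a Delphi version with unit scope names (System.SysUtils, System.StrUtils) and treats a line without "://" as starting directly with the host.

```pascal
program UrlHostDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.StrUtils;

// Hypothetical helper (not from the thread): returns the text between
// '://' and the next '/'.
function ExtractHost(const URL: string): string;
var
  S: string;
  StartPos, EndPos: Integer;
begin
  // Normalise backslashes so '\' and '/' are handled the same way.
  S := StringReplace(Trim(URL), '\', '/', [rfReplaceAll]);

  // Skip the scheme if there is one; otherwise start at the first character.
  StartPos := Pos('://', S);
  if StartPos > 0 then
    Inc(StartPos, 3)
  else
    StartPos := 1;

  // The host ends just before the next '/', or at the end of the string.
  EndPos := PosEx('/', S, StartPos);
  if EndPos = 0 then
    EndPos := Length(S) + 1;

  Result := Copy(S, StartPos, EndPos - StartPos);
end;

begin
  // Prints: github.com
  Writeln(ExtractHost('https://github.com/markjaquith/WordPress/blob/master/'));
end.
```

To process a whole file, one could load it into a TStringList, replace each line with ExtractHost(Line), and call SaveToFile with the same file name. Note that this sketch keeps anything else that sits before the first /, such as a port number (example.com:8080) or a user name, so those would need extra handling if they occur in the data.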