Help extracting text from a text file in VB



  1. #1
    Join Date
    May 2008
    Posts
    10

    Hello,

    Using VB 2008, I need help extracting text from a loaded text file. I read the file into a string and assume I will need to use Split(), but I'm unsure how to go about this properly. Could someone point me to some code snippets, or let me know which functions I should be looking at and how to approach this?

    The text will be of varying length, and I need to pull out certain information. In this example, the information I need is between pairs of tags, i.e.:
    <x>sometext</x>
    <y>sometext2</y>
    <z>somesomesometext</z>

    I need to get all of the text between those tags. Because the files vary in length, the tags can repeat any number of times.

    In this file the text is XML; however, I will need to do the same with other files that are not XML, so the XML parsing code I've found wouldn't work for those.

    Can anyone help out?

    TIA.

  2. #2
    Join Date
    Nov 2003
    Location
    Portland, OR
    Posts
    8,387
    If the file is XML, you should use XML/XPath:
    Code:
    ' Imports System.Xml
    ' Imports System.Xml.XPath
    
    Dim Document As New XmlDocument()
    Document.Load(FileName)
    
    ' Find all the text between <x>...</x> tags
    Dim Nodes As XmlNodeList = Document.SelectNodes("//x")
    For Each Node As XmlNode In Nodes
        Console.WriteLine(Node.InnerText)
    Next
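    The question mentions several tag names (<x>, <y>, <z>). As a minimal sketch of how the same approach could cover all of them, the XPath union operator (|) can combine the paths. The inline sample string and its <root> wrapper element are assumptions made only so the example is self-contained; a real file would be loaded with Document.Load(FileName) as above.

```vb
Imports System
Imports System.Xml

Module MultiTagExample
    Sub Main()
        ' Hypothetical sample data; the <root> wrapper is an assumption so the
        ' string is well-formed XML with a single root element.
        Dim Document As New XmlDocument()
        Document.LoadXml("<root><x>sometext</x><y>sometext2</y><z>somesomesometext</z></root>")

        ' The XPath union operator (|) selects nodes that match any of the paths.
        For Each Node As XmlNode In Document.SelectNodes("//x | //y | //z")
            Console.WriteLine(Node.Name & ": " & Node.InnerText)
        Next
    End Sub
End Module
```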
    For unstructured text, use regular expressions:
    Code:
    ' Imports System.IO
    ' Imports System.Text.RegularExpressions
    
    Dim StartText As String = "<x>"
    Dim EndText As String = "</x>"
    
    Dim Text As String
    Using sr As StreamReader = File.OpenText(FileName)
        Text = sr.ReadToEnd
    End Using
    
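    ' Escape the delimiters so they are matched literally, and use a lazy
    ' quantifier (.+?) so each match stops at the nearest closing delimiter.
    ' Singleline lets "." match newlines, so a value may span several lines.
    Dim Pattern As String = Regex.Escape(StartText) & "(.+?)" & Regex.Escape(EndText)
    Dim Matches As MatchCollection = Regex.Matches(Text, Pattern, RegexOptions.Singleline)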
    
    For Each m As Match In Matches
        Console.WriteLine(m.Groups(1).Value)
    Next
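    If regular expressions feel like too much for simple delimiters, a plain IndexOf/Substring loop is one possible alternative. The following is only a sketch: ExtractBetween and the sample string are hypothetical names and data introduced for illustration, and the function assumes the delimiters do not nest.

```vb
Imports System
Imports System.Collections.Generic

Module DelimitedTextExample
    ' Hypothetical helper: returns every substring found between StartText and EndText.
    Function ExtractBetween(ByVal Text As String, ByVal StartText As String, ByVal EndText As String) As List(Of String)
        ' Empty delimiters would never advance the search position, so reject them.
        If StartText.Length = 0 OrElse EndText.Length = 0 Then
            Throw New ArgumentException("Delimiters must not be empty.")
        End If

        Dim Results As New List(Of String)
        Dim Position As Integer = 0
        Do
            ' Find the next opening delimiter at or after the current position.
            Dim StartIndex As Integer = Text.IndexOf(StartText, Position, StringComparison.Ordinal)
            If StartIndex < 0 Then Exit Do
            StartIndex += StartText.Length

            ' Find the nearest closing delimiter after the opening one.
            Dim EndIndex As Integer = Text.IndexOf(EndText, StartIndex, StringComparison.Ordinal)
            If EndIndex < 0 Then Exit Do

            Results.Add(Text.Substring(StartIndex, EndIndex - StartIndex))
            Position = EndIndex + EndText.Length
        Loop
        Return Results
    End Function

    Sub Main()
        ' Hypothetical sample text with a repeated tag.
        Dim Sample As String = "<x>sometext</x> other <x>more text</x>"
        For Each Value As String In ExtractBetween(Sample, "<x>", "</x>")
            Console.WriteLine(Value)
        Next
    End Sub
End Module
```

    With the sample string above, this prints "sometext" and then "more text". The same function works for non-XML files by passing whatever start and end markers that format uses.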
    Last edited by Phil Weber; 05-10-2008 at 06:06 PM.
    Phil Weber
    http://www.philweber.com

    Please post questions to the forums, where others may benefit.
    I do not offer free assistance by e-mail. Thank you!

  3. #3
    Join Date
    May 2008
    Posts
    10
    Phil,

    Thank you for your help. It took me some time to get it right, but it works like a charm.
