Determining the type of file through header info




Thread: Determining the type of file through header info

  1. #1
    Jeff Guest

    Determining the type of file through header info


    Does anyone know how to go about reading header info from a file to determine
    if it is a word document, text file, wordperfect document and their versions?
    We have 10,000 documents to iterate and determine their types as many do
    not maintain their native file extensions.

    Thanks,
    Jeff.

  2. #2
    mrfelis Guest

    Re: Determining the type of file through header info

    Most files that aren't text files have a signature string (1-2 characters,
    maybe more) somewhere near the beginning of the file.

    You'll need to research the file format of each specific file type you're
    interested in. The signatures vary from one format to the next.

    The OS remembers these things based on the Extension. So if you're changing
    the extension, you're most likely out of luck on an API call.
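
    [Editor's sketch] The signature check described above might look roughly
    like the following in Python. It is a minimal sketch, not a complete
    detector: the function names (sniff, sniff_file, sniff_tree) are
    hypothetical, the table lists only a few well-known leading signatures, and
    the "probably text" fallback is a crude guess (no NUL bytes in the first
    512 bytes). It does not try to work out format versions; that needs the
    per-format research mentioned above.

```python
from pathlib import Path

# A few well-known signatures found at offset 0. This list is illustrative,
# not exhaustive; check each format's documentation before relying on it.
SIGNATURES = [
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. Word 97-2003)"),
    (b"\xffWPC", "WordPerfect"),
    (b"{\\rtf", "RTF"),
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP container"),
]


def sniff(header: bytes) -> str:
    """Guess a file type from the leading bytes of a file."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    # Assumption: non-empty data without NUL bytes is likely plain text.
    # UTF-16 text contains NUL bytes and will fall through to "unknown".
    if header and b"\x00" not in header:
        return "probably text"
    return "unknown"


def sniff_file(path, nbytes: int = 512) -> str:
    """Read only the first nbytes of a file and classify them."""
    with open(path, "rb") as f:
        return sniff(f.read(nbytes))


def sniff_tree(root) -> dict:
    """Classify every regular file under root, ignoring extensions."""
    return {p: sniff_file(p) for p in Path(root).rglob("*") if p.is_file()}
```

    Calling sniff_tree on the folder holding the documents returns a
    path-to-guess mapping. Anything reported as "unknown" or as an OLE2
    compound file still needs a closer look, since several applications share
    that container.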

    --
    ~~~
    C'Ya,
    mrfelis
    mrfelis@yahoo.NOSPAM.com
    just remove the spam
    Jeff <quinner@home.com> wrote in message news:3992fe2e$1@news.devx.com...
    >
    > Does anyone know how to go about reading header info from a file to determine
    > if it is a word document, text file, wordperfect document and their versions?
    > We have 10,000 documents to iterate and determine their types as many do
    > not maintain their native file extensions.
    >
    > Thanks,
    > Jeff.




  4. #4
    Lenny Toulson Guest

    Re: Determining the type of file through header info

    Kinda sucks though, doesn't it? I mean, in the bad old days of DOS,
    extensions were OK. Then when Win3x was nothing but a graphical shell on
    top of DOS, there wasn't much to be done.

    But when MS released Win95 as a new "operating system", you'd think they
    would have added support for file headers while maintaining backward
    compatibility for extensions. This is especially so considering that the
    default setting for Windows Explorer is to hide file extensions for "known
    file types." Now that Win2K is out 3 generations later, have they added
    file header support? Sadly, the answer is still no.

    Sorry, just couldn't resist when I saw the soapbox just sitting there...

    --
    Lenny
    __________


    "mrfelis" <mrfelis@yahoo.NOSPAM.com> wrote in message
    news:399410ff@news.devx.com...

    The OS remembers these things based on the Extension. So if you're changing
    the extension, you're most likely out of luck on an API call.



  6. #6
    Robert Gelb Guest

    Re: Determining the type of file through header info

    Wasn't there an article in MSJ saying that Win2k supports file headers?
    I'll try to search for it. I think it's a matter of vendors taking
    advantage of it.

    --
    Robert Gelb
    www.vbRad.com

    "Lenny Toulson" <ltoulson@nospam.net> wrote in message
    news:39969d12$1@news.devx.com...
    > Kinda sucks though, doesn't it? I mean, in the bad old days of DOS,
    > extensions were OK. Then when Win3x was nothing but a graphical shell on
    > top of DOS, there wasn't much to be done.
    >
    > But when MS released Win95 as a new "operating system", you'd think they
    > would have added support for file headers while maintaining backward
    > compatibility for extensions. This is especially so considering that the
    > default setting for Windows Explorer is to hide file extensions for "known
    > file types." Now that Win2K is out 3 generations later, have they added
    > file header support? Sadly, the answer is still no.
    >
    > Sorry, just couldn't resist when I saw the soapbox just sitting there...
    >
    > --
    > Lenny
    > __________
    >
    >
    > "mrfelis" <mrfelis@yahoo.NOSPAM.com> wrote in message
    > news:399410ff@news.devx.com...
    >
    > The OS remembers these things based on the Extension. So if you're changing
    > the extension, you're most likely out of luck on an API call.
    >
    >




  8. #8
    mrfelis Guest

    Re: Determining the type of file through header info

    Here's a URL for file format descriptions:

    http://www.wotsit.org/

    Can't say if the formats you want are there.

    --
    ~~~
    C'Ya,
    mrfelis
    mrfelis@yahoo.NOSPAM.com
    just remove the spam
    mrfelis <mrfelis@yahoo.NOSPAM.com> wrote in message
    news:399410ff@news.devx.com...
    > Most files that aren't text files have a signature string (1-2 characters,
    > maybe more) somewhere near the beginning of the file.
    >
    > You'll need to research the file format of each specific file type you're
    > interested in. The signatures vary from one format to the next.
    >
    > The OS remembers these things based on the Extension. So if you're changing
    > the extension, you're most likely out of luck on an API call.
    >
    > --
    > ~~~
    > C'Ya,
    > mrfelis
    > mrfelis@yahoo.NOSPAM.com
    > just remove the spam
    > Jeff <quinner@home.com> wrote in message news:3992fe2e$1@news.devx.com...
    > >
    > > Does anyone know how to go about reading header info from a file to determine
    > > if it is a word document, text file, wordperfect document and their versions?
    > > We have 10,000 documents to iterate and determine their types as many do
    > > not maintain their native file extensions.
    > >
    > > Thanks,
    > > Jeff.
    >
    >




  10. #10
    mrfelis Guest

    Re: Determining the type of file through header info

    Yep. It would have been real nice. Would make things easier considering MS
    now allows periods in the middle of a file name.

    If I were grading MS's homework, they would have gotten a D- for that one!

    --
    ~~~
    C'Ya,
    mrfelis
    mrfelis@yahoo.NOSPAM.com
    just remove the spam
    Lenny Toulson <ltoulson@nospam.net> wrote in message
    news:39969d12$1@news.devx.com...
    > Kinda sucks though, doesn't it? I mean, in the bad old days of DOS,
    > extensions were OK. Then when Win3x was nothing but a graphical shell on
    > top of DOS, there wasn't much to be done.
    >
    > But when MS released Win95 as a new "operating system", you'd think they
    > would have added support for file headers while maintaining backward
    > compatibility for extensions. This is especially so considering that the
    > default setting for Windows Explorer is to hide file extensions for "known
    > file types." Now that Win2K is out 3 generations later, have they added
    > file header support? Sadly, the answer is still no.
    >
    > Sorry, just couldn't resist when I saw the soapbox just sitting there...
    >
    > --
    > Lenny
    > __________
    >
    >
    > "mrfelis" <mrfelis@yahoo.NOSPAM.com> wrote in message
    > news:399410ff@news.devx.com...
    >
    > The OS remembers these things based on the Extension. So if you're changing
    > the extension, you're most likely out of luck on an API call.
    >
    >



