Jordan Abel *nix forums Guru
Joined: 25 Oct 2005
Posts: 1366
|
Posted: Sun Feb 26, 2006 8:01 pm Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
On 2006-02-26, Logan Shaw <lshaw-usenet@austin.rr.com> wrote:
| Quote: | Jordan Abel wrote:
On 2006-02-24, Gordon Burditt <gordonb.l20vx@burditt.org> wrote:
fd = open("big", O_WRONLY | O_CREAT, 0666);
| O_LARGE
Where is that supposed to be defined? I don't see it in
any source file for FreeBSD (other than as part of some
symbols of the form *_TOO_LARGE, mostly in openssl ).
lseek(fd, offset, 0);
man 2 llseek
No such manual page.
Where can I buy storage, cheap, that needs more than 64 bits for
the length of the file? And what do I need it for? Archiving
the entire contents of Google hourly?
I bet he uses Linux. Linux has traditionally maintained two separate and
parallel APIs, one in which off_t is a 'long' (32 on a 32-bit system),
the other in which it's 64 bits. llseek is one of the underlying system
calls for the latter mode, and it is also sometimes misused (where
lseek64 _should_ be used) to be able to access files in "large file"
mode in the former. O_LARGEFILE (misspelled above as O_LARGE) is a bit
used internally by open64() which is also sometimes misused in the same
way.
Just a data point: on Solaris 8 (and presumably later versions as well
since Sun is anal-retentive about keeping interface compatibility at
both the binary and source levels[1]), the documentation says that
calling open() with O_LARGEFILE is equivalent to calling open64(),
which to me indicates that using O_LARGEFILE with open() is kosher.
Solaris also maintains a parallel set of APIs, so that 32-bit
applications can use either 32-bit or 64-bit file offsets, so Linux
isn't unique in that regard. (I don't even think Linux was first,
but I can't remember.)
- Logan
[1] which, by the way, is a good thing in many cases
|
I won't disagree that binary compatibility is good - but off_t should
have been 64-bit to start with. On FreeBSD, off_t has NEVER been less
than 64 bits. |
 |
50295@web.de *nix forums beginner
Joined: 09 Feb 2005
Posts: 32
|
Posted: Sun Feb 26, 2006 7:35 pm Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Olumide wrote:
| Quote: | While we're on the subject of file systems, I would like to ask about
hard links.
(1) I know they can only refer to data on the same
volume/filestore/partition. ...
|
Just to confirm, is this correct? Are *NIX directory entries limited to
inodes on the same volume OR partition OR filestore? (I'm not too sure
about the partition bit. I guess directory entries will be limited, if
each partition numbered its inodes from 0 ...) |
 |
Logan Shaw *nix forums Guru
Joined: 21 Feb 2005
Posts: 474
|
Posted: Sun Feb 26, 2006 6:58 pm Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Olumide wrote:
| Quote: | While we're on the subject of file systems, I would like to ask about
hard links.
(1) I know they can only refer to data on the same
volume/filestore/partition. The question is why? My reasoning (and I
may have read this somewhere a long time ago) is that because each
volume/filestore/partition has a list of inode numbered from 0 or 1 to
whatever, and linking a target name merely creates a new directory
entry that points to the same inode
|
You are on the right track. The correct answer is that there is no
such thing as a "hard link". Or to be more precise, if you do this:
date > a
ln a b
then it is not correct to say that "a" is a file and "b" is a hard
link to it. Both "a" and "b" are on equal standing and there is
no distinction between them. Assuming "a" and "b" do not already
exist, the above two commands are exactly equivalent to this:
date > b
ln b a
Once you have done either of the two sequences of commands (that is,
create "a" first, then link "b" to it, or vice versa), there is no
way to tell which sequence you chose, because there is no difference
in the outcome. Both "a" and "b" will have the same modification
date, etc., etc., because they have the same i-node.
A simple way to think of this is that, in Unix, files don't have
names. Instead, in Unix, directories have names *for* files. This
is different than how it works on many other systems (such as the
DOS FAT filesystem), where files have names as part of their own
structure.
Anyway, the point here is the reason a "hard link" can't be on
a different filesystem is the same reason a regular file can't be
on a different filesystem, because they are the same thing anyway.
(And as it turns out, the reason for this is that a directory does
not have the ability to refer to a file on a different filesystem.
Which does make sense if you think about it.)
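The "a" and "b" example above can be checked from a script. This is a minimal sketch, assuming a POSIX-like system and a temporary directory on a filesystem that supports hard links; the helper name link_demo is made up for illustration. It creates "a", links "b" to it, and compares what stat reports for the two names.

```python
import os
import tempfile


def link_demo(workdir):
    """Create "a", link "b" to it, and return the stat results for both names."""
    a = os.path.join(workdir, "a")
    b = os.path.join(workdir, "b")
    with open(a, "w") as f:
        f.write("created first\n")
    os.link(a, b)  # same effect as: ln a b
    return os.stat(a), os.stat(b)


with tempfile.TemporaryDirectory() as d:
    st_a, st_b = link_demo(d)
    # Both names resolve to the same inode on the same device,
    # and the inode's link count reflects the two directory entries.
    print(st_a.st_ino == st_b.st_ino, st_a.st_dev == st_b.st_dev, st_a.st_nlink)
```

Nothing in either stat result records which name was created first, which is the point being made above.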
| Quote: | (2) Why can't directories be hard-linked to? After all, this is what
the OS does when it automatically creates the entries "." and ".."
|
There is no technical limitation here; it would be easy enough to
implement. The problem is, it gets very confusing very
quickly. Consider this:
mkdir foo
mkdir bar
mkdir foo/abc
ln foo/abc bar/abc
cd foo/abc
Now, after that sequence of commands, what should ".." (inside foo/abc)
refer to? ".." refers to the parent directory, right? Well, what is
abc's parent? Is it foo or is it bar? It's both! abc has two parents,
and ".." can only point to one. Which one should you choose? This
just gets ugly really quickly.
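For what it's worth, current systems tend to sidestep the question by refusing the link outright. A small sketch, assuming a Linux-like system where link(2) on a directory fails (typically with EPERM); the function name try_link_directory is hypothetical, and other systems may report a different error.

```python
import os
import tempfile


def try_link_directory(workdir):
    """Attempt the equivalent of `ln foo/abc bar/abc`; return the error, if any."""
    os.makedirs(os.path.join(workdir, "foo", "abc"))
    os.mkdir(os.path.join(workdir, "bar"))
    try:
        os.link(os.path.join(workdir, "foo", "abc"),
                os.path.join(workdir, "bar", "abc"))
    except OSError as exc:
        return exc  # typically EPERM on Linux
    return None  # the system allowed the link


with tempfile.TemporaryDirectory() as d:
    print(try_link_directory(d))
```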
- Logan |
 |
Logan Shaw *nix forums Guru
Joined: 21 Feb 2005
Posts: 474
|
Posted: Sun Feb 26, 2006 6:45 pm Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Jordan Abel wrote:
| Quote: | On 2006-02-24, Gordon Burditt <gordonb.l20vx@burditt.org> wrote:
fd = open("big", O_WRONLY | O_CREAT, 0666);
| O_LARGE
Where is that supposed to be defined? I don't see it in
any source file for FreeBSD (other than as part of some
symbols of the form *_TOO_LARGE, mostly in openssl ).
lseek(fd, offset, 0);
man 2 llseek
No such manual page.
Where can I buy storage, cheap, that needs more than 64 bits for
the length of the file? And what do I need it for? Archiving
the entire contents of Google hourly?
I bet he uses Linux. Linux has traditionally maintained two separate and
parallel APIs, one in which off_t is a 'long' (32 on a 32-bit system),
the other in which it's 64 bits. llseek is one of the underlying system
calls for the latter mode, and it is also sometimes misused (where
lseek64 _should_ be used) to be able to access files in "large file"
mode in the former. O_LARGEFILE (misspelled above as O_LARGE) is a bit
used internally by open64() which is also sometimes misused in the same
way.
|
Just a data point: on Solaris 8 (and presumably later versions as well
since Sun is anal-retentive about keeping interface compatibility at
both the binary and source levels[1]), the documentation says that
calling open() with O_LARGEFILE is equivalent to calling open64(),
which to me indicates that using O_LARGEFILE with open() is kosher.
Solaris also maintains a parallel set of APIs, so that 32-bit
applications can use either 32-bit or 64-bit file offsets, so Linux
isn't unique in that regard. (I don't even think Linux was first,
but I can't remember.)
- Logan
[1] which, by the way, is a good thing in many cases |
 |
toby *nix forums addict
Joined: 01 Jul 2005
Posts: 87
|
Posted: Sun Feb 26, 2006 4:22 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Olumide wrote:
| Quote: | Gordon Burditt wrote:
Hard linking directories can make an AWFUL mess of a filesystem
and serves no useful purpose. That's a good reason not
to allow it.
I suppose soft links to directories are less messy then? Right?
(scratches head)
|
It is conceptually different, although the effect may be the same (if
the target exists). Two hardlinked entities in a filesystem are
indistinguishable, while symlinks have a directedness. Analogies
include duplicate A records versus CNAMEs in DNS, or HTTP Redirect
versus ServerAliases... Each mechanism has its distinct purposes. |
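The "directedness" can be seen by removing the original name and checking what each kind of link still gives you. A minimal sketch, assuming a POSIX-like system where unprivileged symlink creation is allowed; the function and key names are made up for illustration.

```python
import os
import tempfile


def link_kinds(workdir):
    """Make one hard link and one symlink to a file, remove the original name,
    and report what is left."""
    target = os.path.join(workdir, "target")
    hard = os.path.join(workdir, "hard")
    soft = os.path.join(workdir, "soft")
    with open(target, "w") as f:
        f.write("data\n")
    os.link(target, hard)     # second name for the same inode
    os.symlink(target, soft)  # separate object that stores a path
    os.remove(target)         # drop the original name
    return {
        "hard_is_symlink": os.path.islink(hard),
        "soft_is_symlink": os.path.islink(soft),
        "hard_still_readable": os.path.exists(hard),
        "soft_still_resolves": os.path.exists(soft),  # exists() follows symlinks
        "soft_points_to": os.readlink(soft),
    }


with tempfile.TemporaryDirectory() as d:
    for key, value in link_kinds(d).items():
        print(key, value)
```

The hard link keeps the data alive because it is just another name for the inode; the symlink still holds the old path but now dangles.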
 |
Maxim S. Shatskih *nix forums addict
Joined: 02 Apr 2005
Posts: 55
|
Posted: Sun Feb 26, 2006 2:07 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
| Quote: | I suppose soft links to directories are less messy then? Right?
|
Yes, tools that recurse through directories just ignore softlinks. Also, softlinks have no problem
crossing volumes.
--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com |
 |
50295@web.de *nix forums beginner
Joined: 09 Feb 2005
Posts: 32
|
Posted: Sun Feb 26, 2006 1:58 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Gordon Burditt wrote:
| Quote: |
Hard linking directories can make an AWFUL mess of a filesystem
and serves no useful purpose. That's a good reason not
to allow it.
|
I suppose soft links to directories are less messy then? Right?
(scratches head) |
 |
Gordon Burditt *nix forums Guru
Joined: 02 Mar 2005
Posts: 773
|
Posted: Sun Feb 26, 2006 1:22 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
| Quote: | (2) Why can't directories be hard-linked to? After all, this is what
the OS does when it automatically creates the entries "." and ".."
In UNIX V7, you could. mkdir() as a system call didn't exist.
And you could make an awful mess where, for example:
/
/.
/./.
/..
/../.
/./..
referred to 6 different directories and the last 5 were reachable
*ONLY* by the paths given. Other than neat ways for viruses
to hide stuff, I don't see what use hard-linking to directories has.
And where .. in such a directory is supposed to point is problematical.
I'm sorry but I don't get the point you're trying to get across.
|
Hard linking directories can make an AWFUL mess of a filesystem
and serves no useful purpose. That's a good reason not
to allow it.
Gordon L. Burditt |
 |
50295@web.de *nix forums beginner
Joined: 09 Feb 2005
Posts: 32
|
Posted: Sat Feb 25, 2006 11:36 pm Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Gordon Burditt wrote:
| Quote: | (2) Why can't directories be hard-linked to? After all, this is what
the OS does when it automatically creates the entries "." and ".."
In UNIX V7, you could. mkdir() as a system call didn't exist.
And you could make an awful mess where, for example:
/
/.
/./.
/..
/../.
/./..
referred to 6 different directories and the last 5 were reachable
*ONLY* by the paths given. Other than neat ways for viruses
to hide stuff, I don't see what use hard-linking to directories has.
And where .. in such a directory is supposed to point is problematical.
|
I'm sorry but I don't get the point you're trying to get across. |
 |
Maxim S. Shatskih *nix forums addict
Joined: 02 Apr 2005
Posts: 55
|
Posted: Sat Feb 25, 2006 3:27 pm Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
| Quote: | option, right? (After all, each volume/filestore/partition has its own
inode 821.)
|
Correct, and so, allowing cross-volume hardlinks would require storing the
volume name in dirent together with the inode number.
| Quote: | (2) Why can't directories be hard-linked to? After all, this is what
the OS does when it automatically creates the entries "." and ".."
|
Because this pollutes the notion of the "parent directory": which directory is
the parent, link1 or link2?
This can also break any tool that does directory recursion.
--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com |
 |
Brian Raiter *nix forums beginner
Joined: 15 Mar 2005
Posts: 5
|
Posted: Sat Feb 25, 2006 4:26 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
| Quote: | (2) Why can't directories be hard-linked to? After all, this is what
the OS does when it automatically creates the entries "." and ".."
|
They can, but you need root access. Basically, your average Unix
system back then was not equipped to handle arbitrary loops in the
directory hierarchy intelligently. Special code handles "..", but
other circular structures could lead to infinite loops. So, the idea
was that only the superuser should be trusted not to introduce such
structures haphazardly.
b |
 |
Gordon Burditt *nix forums Guru
Joined: 02 Mar 2005
Posts: 773
|
Posted: Sat Feb 25, 2006 2:35 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
| Quote: | While we're on the subject of file systems, I would like to ask about
hard links.
(1) I know they can only refer to data on the same
volume/filestore/partition. The question is why? My reasoning (and I
|
How do you refer to data on another filesystem, assuming that there
was such a field present along with the inode number (dev_t, perhaps?
The pair (dev_t, inode) is supposed to be unique in the system.
Not sure what NFS does for the dev_t of a mounted volume, though.)
You have a problem here that filesystems (both the referring and
referred-to filesystems) aren't always mounted in the same place
(or even the same system), and that filesystems are sometimes on
removable media.
Note also that sometimes if you plug in another disk drive, some
of the others get renumbered.
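The (dev_t, inode) pair mentioned above is what stat exposes as st_dev and st_ino. A minimal sketch of using that pair as a file's identity on the running system; file_identity and same_file are hypothetical helper names, and as far as I know os.path.samefile makes the same comparison.

```python
import os
import tempfile


def file_identity(path):
    """Return the (device, inode) pair that identifies a file on this system."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def same_file(p, q):
    """True if both paths name the same inode on the same device."""
    return file_identity(p) == file_identity(q)


with tempfile.TemporaryDirectory() as d:
    a, b, c = (os.path.join(d, name) for name in ("a", "b", "c"))
    for path in (a, c):
        with open(path, "w") as f:
            f.write("x\n")
    os.link(a, b)
    print(same_file(a, b), same_file(a, c))
```

The inode number alone would not be enough, since every filesystem numbers its own inodes. An attempt to link across filesystems is normally rejected by link(2) with EXDEV rather than stored as a (device, inode) reference, for the reasons given above.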
| Quote: | may have read this somewhere a long time ago) is that because each
volume/filestore/partition has a list of inode numbered from 0 or 1 to
whatever, and linking a target name merely creates a new directory
entry that points to the same inode (number 821 for example) as the
source, and because it is possible to have more than 1
volume/filestore/partition available, restricting hard links to inodes
on the same volume/filestore/partition as the source is the only
option, right? (After all, each volume/filestore/partition has its own
inode 821.)
|
It's NOT the only option, but how do you reasonably refer to another
volume when you don't know if or where it's mounted?
| Quote: | (2) Why can't directories be hard-linked to? After all, this is what
the OS does when it automatically creates the entries "." and ".."
|
In UNIX V7, you could. mkdir() as a system call didn't exist.
And you could make an awful mess where, for example:
/
/.
/./.
/..
/../.
/./..
referred to 6 different directories and the last 5 were reachable
*ONLY* by the paths given. Other than neat ways for viruses
to hide stuff, I don't see what use hard-linking to directories has.
And where .. in such a directory is supposed to point is problematical.
Gordon L. Burditt |
 |
50295@web.de *nix forums beginner
Joined: 09 Feb 2005
Posts: 32
|
Posted: Sat Feb 25, 2006 1:38 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Thanks everyone!
While we're on the subject of file systems, I would like to ask about
hard links.
(1) I know they can only refer to data on the same
volume/filestore/partition. The question is why? My reasoning (and I
may have read this somewhere a long time ago) is that because each
volume/filestore/partition has a list of inode numbered from 0 or 1 to
whatever, and linking a target name merely creates a new directory
entry that points to the same inode (number 821 for example) as the
source, and because it is possible to have more than 1
volume/filestore/partition available, restricting hard links to inodes
on the same volume/filestore/partition as the source is the only
option, right? (After all, each volume/filestore/partition has its own
inode 821.)
(2) Why can't directories be hard-linked to? After all, this is what
the OS does when it automatically creates the entries "." and ".."
- Olumide |
 |
50295@web.de *nix forums beginner
Joined: 09 Feb 2005
Posts: 32
|
Posted: Sat Feb 25, 2006 12:37 am Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
Maxim S. Shatskih wrote:
| Quote: | Thanks for your reply Gordon. Just to confirm, are you saying that the
practical maximum size of a file is determined by the size of the read
write pointer i.e. off_t?
Correct. The older UNIXen had the 4GB limit only due to using 32bit types for
off_t, and 32bit type for a "file size" field in the on-disk metadata. Nothing
more.
|
Erm ... don't you mean 2GB as off_t is ... erm .. signed? |
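The 2GB versus 4GB question comes down to whether one bit is spent on the sign. A quick arithmetic sketch; max_offset is a made-up helper, and the widths are only the ones discussed in this thread.

```python
def max_offset(bits, signed=True):
    """Largest value representable in an integer type of the given width."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


GIB = 2 ** 30
EIB = 2 ** 60

print(max_offset(32) / GIB)                # signed 32-bit: just under 2 GiB
print(max_offset(32, signed=False) / GIB)  # unsigned 32-bit: just under 4 GiB
print(max_offset(64) / EIB)                # signed 64-bit: just under 8 EiB
```

So a signed 32-bit off_t caps offsets just below 2 GiB, while an unsigned 32-bit size field could count up to just below 4 GiB.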
 |
Jordan Abel *nix forums Guru
Joined: 25 Oct 2005
Posts: 1366
|
Posted: Fri Feb 24, 2006 6:32 pm Post subject:
Re: Whats the practical maximum file size using indexed allocation (I nodes)
|
|
|
On 2006-02-24, Gordon Burditt <gordonb.l20vx@burditt.org> wrote:
| Quote: | fd = open("big", O_WRONLY | O_CREAT, 0666);
| O_LARGE
Where is that supposed to be defined? I don't see it in
any source file for FreeBSD (other than as part of some
symbols of the form *_TOO_LARGE, mostly in openssl ).
lseek(fd, offset, 0);
man 2 llseek
No such manual page.
Where can I buy storage, cheap, that needs more than 64 bits for
the length of the file? And what do I need it for? Archiving
the entire contents of Google hourly?
|
I bet he uses Linux. Linux has traditionally maintained two separate and
parallel APIs, one in which off_t is a 'long' (32 on a 32-bit system),
the other in which it's 64 bits. llseek is one of the underlying system
calls for the latter mode, and it is also sometimes misused (where
lseek64 _should_ be used) to be able to access files in "large file"
mode in the former. O_LARGEFILE (misspelled above as O_LARGE) is a bit
used internally by open64() which is also sometimes misused in the same
way.
You can apparently #define _FILE_OFFSET_BITS 64 at the top of your
source file to get all the off64_t crap to be transparently used, but
avoid use of any libraries [other than glibc itself, which magically
knows the difference, or, more likely, has _all_ functions that would
use an off_t replaced with alternate versions] that do anything
involving off_t if you do that. There's probably a way around this, but
it gets progressively less sane.
llseek itself takes two longs, high-order first - which I suppose are
probably converted from a long long or "loff_t" in the userspace
version.
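To illustrate the "two longs, high-order first" layout, here is a sketch of how a 64-bit offset splits into two 32-bit halves and goes back together. It only shows the arithmetic and does not call the system call; split_offset and join_offset are hypothetical names.

```python
MASK32 = 0xFFFFFFFF


def split_offset(offset):
    """Split a 64-bit offset into (high, low) 32-bit halves, high-order first."""
    if not 0 <= offset < 2 ** 64:
        raise ValueError("offset does not fit in 64 bits")
    return (offset >> 32) & MASK32, offset & MASK32


def join_offset(high, low):
    """Rebuild the 64-bit offset from its two 32-bit halves."""
    return (high << 32) | low


five_gib = 5 * 2 ** 30
high, low = split_offset(five_gib)
print(high, low, join_offset(high, low) == five_gib)
```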
I assume this was done because they didn't know better at first, and
wanted to maintain binary compatibility later.
In conclusion, llseek and O_LARGEFILE are for idiots who don't know how
to _really_ use the largefile interface. |