niXforums Forum Index
Whats the practical maximum file size using indexed allocation (I nodes)
Maxim S. Shatskih
*nix forums addict


Joined: 02 Apr 2005
Posts: 55

Posted: Fri Feb 24, 2006 6:24 pm    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

Quote:
Thanks for your reply Gordon. Just to confirm, are you saying that the
practical maximum size of a file is determined by the size of the read
write pointer i.e. off_t?

Correct. The older UNIXen had the 4GB limit only because they used a 32-bit
type for off_t and a 32-bit type for the "file size" field in the on-disk
metadata. Nothing more.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com
Gordon Burditt
*nix forums Guru


Joined: 02 Mar 2005
Posts: 773

Posted: Fri Feb 24, 2006 5:48 pm    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

Quote:
fd = open("big", O_WRONLY | O_CREAT, 0666);

| O_LARGE

Where is that supposed to be defined? I don't see it in
any source file for FreeBSD (other than as part of some
symbols of the form *_TOO_LARGE, mostly in openssl ).

Quote:

lseek(fd, offset, 0);

man 2 llseek

No such manual page.

Where can I buy storage, cheap, that needs more than 64 bits for
the length of the file? And what do I need it for? Archiving
the entire contents of Google hourly?
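
A note for readers on Linux rather than FreeBSD: the flag there is spelled O_LARGEFILE (not O_LARGE), and llseek is a Linux-specific call, which would explain why neither turns up on FreeBSD. With glibc the usual route is to define _FILE_OFFSET_BITS as 64 before any include, so that plain open() and lseek() work with a 64-bit off_t. The sketch below assumes glibc; off_t_bits is a hypothetical helper name, not something from the thread.

```c
#define _FILE_OFFSET_BITS 64   /* glibc: ask for a 64-bit off_t; must come before all includes */
#include <limits.h>
#include <sys/types.h>

/* Width of off_t in bits, as seen by this translation unit. */
int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

On a system where off_t is already 64 bits wide the macro changes nothing and the helper simply reports 64.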

Gordon L. Burditt
Frank Kotler
*nix forums beginner


Joined: 29 Jun 2005
Posts: 2

Posted: Fri Feb 24, 2006 8:33 am    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

Gordon Burditt wrote:

....
Quote:
fd = open("big", O_WRONLY | O_CREAT, 0666);

| O_LARGE

Quote:
lseek(fd, offset, 0);

man 2 llseek

Dunno if that'll help you or not...

Best,
Frank
Gordon Burditt
*nix forums Guru


Joined: 02 Mar 2005
Posts: 773

Posted: Thu Feb 23, 2006 11:01 pm    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

Quote:
Thanks for your reply Gordon. Just to confirm, are you saying that the
practical maximum size of a file is determined by the size of the read
write pointer i.e. off_t?

Among other possible limits. There are others (in combination)
which could make it SMALLER, such as:
- The number of bits used in a block number, combined with block size.
  (FreeBSD seems to be using AT LEAST 33 bits for block numbers)
- The maximum size storage device available.
- The number of bits available in a SCSI command for the block number.
- etc.
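
The first limit in that list is plain arithmetic: the number of distinct block numbers times the block size. A minimal sketch follows; max_addressable_bytes is a hypothetical helper name, and the bit widths you pass in are whatever the file system or command format in question provides.

```c
#include <stdint.h>

/* Largest byte count addressable with `bits`-bit block numbers and
 * `block_size`-byte blocks. Saturates at UINT64_MAX if the product
 * does not fit in 64 bits. */
uint64_t max_addressable_bytes(unsigned bits, uint64_t block_size)
{
    if (block_size == 0)
        return 0;
    if (bits >= 64 || block_size > (UINT64_MAX >> bits))
        return UINT64_MAX;
    return block_size << bits;   /* block_size * 2^bits */
}
```

For instance, max_addressable_bytes(32, 512) gives 2 TiB, and max_addressable_bytes(33, 16384) gives 128 TiB.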

Quote:
One other issue: According to the man page, "lseek returns the
resulting offset location as measured in bytes from the beginning of
the file." If this is true, then I suspect that off_t need not be
signed.

Read the part about the second argument of lseek(), in conjunction
with a third argument of SEEK_CUR or SEEK_END. The offset is being
used as a signed number in that situation.
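
To make the signedness point concrete, here is a small sketch that moves backwards with SEEK_CUR, which needs a negative offset. Note also that lseek() reports failure by returning (off_t)-1. The function name seek_back_demo and the scratch path used in the usage note are assumptions for illustration only.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes 10 bytes to a scratch file, then seeks 4 bytes backwards from
 * the current position. Returns the resulting absolute offset, or -1
 * on error. The scratch file is removed before returning. */
off_t seek_back_demo(const char *path)
{
    off_t pos = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, "0123456789", 10) == 10)
        pos = lseek(fd, (off_t)-4, SEEK_CUR);   /* negative, relative offset */
    close(fd);
    unlink(path);
    return pos;
}
```

Calling seek_back_demo("/tmp/seek_demo.tmp") should return 6: the position after the write is 10, and the relative seek moves it back by 4.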

Gordon L. Burditt
Måns Rullgård
*nix forums Guru


Joined: 01 Mar 2005
Posts: 754

Posted: Thu Feb 23, 2006 10:53 pm    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

"Olumide" <50295@web.de> writes:

Quote:
One other issue: According to the man page, "lseek returns the
resulting offset location as measured in bytes from the beginning of
the file." If this is true, then I suspect that off_t need not be
signed.

Hint: SEEK_CUR

--
Måns Rullgård
mru@inprovide.com
50295@web.de
*nix forums beginner


Joined: 09 Feb 2005
Posts: 32

Posted: Thu Feb 23, 2006 10:32 pm    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

Gordon Burditt wrote:

Quote:
FreeBSD has used 64-bit file offsets for quite some time. Whether
or not it actually uses triple indirect pointers,
...
off_t (64 bits) is bigger than a long int (32 bits) on FreeBSD on an ia32.
Also remember that for lseek, off_t needs to be *signed*.

Anyway, because the read and write system calls also use this
"pointer", the size of off_t determines the practical maximum
file size on the system in question.

This is my reasoning. Does it make sense? Is there any other reason why
the theoretical maximum file size is unobtainable?

Yes. There may be insufficient addressing in the hardware devices,
drivers, or controllers.

Thanks for your reply Gordon. Just to confirm, are you saying that the
practical maximum size of a file is determined by the size of the read
write pointer i.e. off_t?

One other issue: According to the man page, "lseek returns the
resulting offset location as measured in bytes from the beginning of
the file." If this is true, then I suspect that off_t need not be
signed.

- Olumide
Gordon Burditt
*nix forums Guru


Joined: 02 Mar 2005
Posts: 773

Posted: Thu Feb 23, 2006 7:51 pm    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

Quote:
'Been reading a few texts - Operating System Concepts [5ed] by
Silberschatz and Galvin, and Operating Systems Concepts: a modern
perspective [2ed] by Gary Nutt, and according to the latter text (page
427), "current versions of BSD UNIX do not use the triple indirect
pointer ... partly because the 32-bit addresses used in the file system
precludes file sizes larger than 2Gb".

FreeBSD has used 64-bit file offsets for quite some time. Whether
or not it actually uses triple indirect pointers, I don't know, but
you can get some *very* big files that won't be handled by only
double-indirect. And if you're willing to deal with files that
have unallocated holes in them, you can have files big enough to
need a triple-indirect block that fit *on a floppy*.
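
The "holes" remark can be checked with stat(): a sparse file's apparent size (st_size) can be far larger than the space actually allocated to it (st_blocks). The sketch below is only an illustration; make_sparse and struct sparse_info are hypothetical names, and counting st_blocks in 512-byte units is an assumption that holds on the BSDs and Linux but is not guaranteed everywhere.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct sparse_info {
    off_t size;            /* apparent size in bytes, -1 on error */
    long long allocated;   /* bytes actually allocated, -1 on error */
};

/* Creates `path` with a hole of `hole` bytes followed by one data byte,
 * reports what fstat() says about it, and removes the file again. */
struct sparse_info make_sparse(const char *path, off_t hole)
{
    struct sparse_info info = { -1, -1 };
    struct stat st;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return info;
    if (lseek(fd, hole, SEEK_SET) == hole    /* skip ahead without writing */
        && write(fd, "A", 1) == 1            /* only this byte is real data */
        && fstat(fd, &st) == 0) {
        info.size = st.st_size;
        info.allocated = (long long)st.st_blocks * 512;
    }
    close(fd);
    unlink(path);
    return info;
}
```

With a 64 MiB hole the reported size is 64 MiB plus one byte. On a file system that supports holes the allocated figure should be much smaller, though the exact value depends on the file system.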

Quote:
The former text shares the same/similar view and states (on page 380)
that: "the number of blocks that can be allocated to a file exceeds the
amount of space addressable by the 4-byte file pointers ..."

Does this mean that the theoretical maximum file size of approx 16GB
(assuming 1KB disk blocks) cannot be achieved on a 32-bit system? ...

What's a "32-bit system"? A 32-bit int, or a 32-bit long, does not
imply a 32-bit off_t. FreeBSD running on an ia32 (Pentium) processor
is generally considered 32-bit, but that's not what it uses for the
file system. MS-DOS ran on a 16-bit system (8086 processor) but
it was never limited to a maximum file size of 64K.

Quote:
I'm trying to get my mind round this, and this is what I've come up
with so far:

First of all, I don't see the read or write system calls failing since
they return the number of bytes read or written per call. However,
lseek would be a problem because lseek returns the position of the
read/write pointer of the file descriptor - the maximum size of which
is off_t (dunno what this is; long int?).

off_t (64 bits) is bigger than a long int (32 bits) on FreeBSD on an ia32.
Also remember that for lseek, off_t needs to be *signed*.
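
Because off_t is signed, one bit is spent on the sign, so the largest representable offset for an N-bit off_t is 2^(N-1) - 1. A short sketch of that arithmetic follows; max_signed_offset is a hypothetical helper name.

```c
#include <stdint.h>

/* Largest offset representable in a signed integer type that is `bits`
 * wide, i.e. 2^(bits-1) - 1. Returns -1 for widths outside 2..64. */
int64_t max_signed_offset(int bits)
{
    if (bits < 2 || bits > 64)
        return -1;
    if (bits == 64)
        return INT64_MAX;
    return ((int64_t)1 << (bits - 1)) - 1;
}
```

max_signed_offset(32) is 2147483647 (just under 2 GiB), while max_signed_offset(64) is 9223372036854775807.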

Quote:
Anyway, because the read and write system calls also use this
"pointer", the size of off_t determines the practical maximum
file size on the system in question.

This is my reasoning. Does it make sense? Is there any other reason why
the theoretical maximum file size is unobtainable?

Yes. There may be insufficient addressing in the hardware devices,
drivers, or controllers. For example, a decade or two ago there
were problems with hard disks having more than 1024 cylinders because
the controller hardware didn't have enough bits in the registers.
The number of bytes you can put in a SCSI command can be a limitation.


I wrote this silly program on FreeBSD 6.0, and ran it. It seeks
1T into the file, writes one byte, seeks another 1T into the file,
and writes one byte, repeat until it fails. It failed with
write: file too large

% ls -lsh /tmp/big
6160 -rw-rw-r-- 1 root wheel 128T Feb 23 13:23 /tmp/big
% ls -ls /tmp/big
6160 -rw-rw-r-- 1 root wheel 140737488355456 Feb 23 13:23 /tmp/big
%

Let's see, this file takes up 6160K of actual disk space. The
filesystem block size is 16K. There were 128 writes of 1 byte, but
each write occupies one data, one single-indirect, and one
double-indirect (filesystem) block. That leaves 16K left over for
a triple-indirect block, which is exactly what was used.
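
The numbers in the ls output can be reproduced with two small formulas. The sketch below assumes the accounting just described (one data, one single-indirect and one double-indirect block per write, plus one shared triple-indirect block) and that each loop iteration advances the file position by the seek distance plus the one byte written. Both function names are hypothetical.

```c
#include <stdint.h>

/* Disk usage in KiB: three blocks per write plus one shared
 * triple-indirect block. */
uint64_t sparse_disk_kib(uint64_t writes, uint64_t block_kib)
{
    return writes * 3 * block_kib + block_kib;
}

/* Final file size in bytes: every iteration seeks `stride` bytes and
 * then writes one byte, so the position grows by stride + 1 each time. */
uint64_t sparse_file_size(uint64_t writes, uint64_t stride)
{
    return writes * (stride + 1);
}
```

sparse_disk_kib(128, 16) yields 6160 and sparse_file_size(128, 1ULL << 40) yields 140737488355456, matching both columns of the ls output.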


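#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>	/* exit() */
#include <unistd.h>

int
main(void)
{
	int fd;
	off_t offset;
	int ret;

	offset = 1024;		/* 1K */
	offset *= 1024;		/* 1M */
	offset *= 1024;		/* 1G */
	offset *= 1024;		/* 1T */

	fd = open("big", O_WRONLY | O_CREAT, 0666);
	if (fd < 0)
	{
		perror("open");
		exit(1);
	}
	lseek(fd, offset, 0);		/* whence 0 is SEEK_SET: start 1T in */
	while (1)
	{
		ret = write(fd, "A", 1);
		if (ret < 0)
		{
			perror("write");	/* the loop ends here once the size limit is hit */
			break;
		}
		lseek(fd, offset, 1);	/* whence 1 is SEEK_CUR: skip another 1T */
	}
	close(fd);
	exit(0);
}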

Gordon L. Burditt
Alexei A. Frounze
*nix forums Guru Wannabe


Joined: 10 Apr 2005
Posts: 243

Posted: Thu Feb 23, 2006 4:30 am    Post subject: Re: Whats the practical maximum file size using indexed allocation (I nodes)

"Olumide" <50295@web.de> wrote in message
news:1140667236.316954.269240@f14g2000cwb.googlegroups.com...
Quote:
Hi -

'Been reading a few texts - Operating System Concepts [5ed] by
Silberschatz and Galvin, and Operating Systems Concepts: a modern
perspective [2ed] by Gary Nutt, and according to the latter text (page
427), "current versions of BSD UNIX do not use the triple indirect
pointer ... partly because the 32-bit addresses used in the file system
precludes file sizes larger than 2Gb".

The former text shares the same/similar view and states (on page 380)
that: "the number of blocks that can be allocated to a file exceeds the
amount of space addressable by the 4-byte file pointers ..."

Does this mean that the theoretical maximum file size of approx 16GB
(assuming 1KB disk blocks) cannot be achieved on a 32-bit system? ...
I'm trying to get my mind round this, and this is what I've come up
with so far:

First of all, I don't see the read or write system calls failing since
they return the number of bytes read or written per call. However,
lseek would be a problem because lseek returns the position of the
read/write pointer of the file descriptor - the maximum size of which
is off_t (dunno what this is; long int?).

Anyway, because the read and write system calls also use this
"pointer", the size of off_t determines the practical maximum
file size on the system in question.

This is my reasoning. Does it make sense? Is there any other reason why
the theoretical maximum file size is unobtainable?

I think the reason could be different. Files can be mapped to memory;
that's very handy, and modern OSes usually support it and benefit from such
a feature themselves. But files whose size exceeds the accessible address
space can't be mapped in whole. That could be the reason why. At least,
this reason is more reasonable than sizeof(int)...

Alex
50295@web.de
*nix forums beginner


Joined: 09 Feb 2005
Posts: 32

Posted: Thu Feb 23, 2006 4:00 am    Post subject: Whats the practical maximum file size using indexed allocation (I nodes)

Hi -

'Been reading a few texts - Operating System Concepts [5ed] by
Silberschatz and Galvin, and Operating Systems Concepts: a modern
perspective [2ed] by Gary Nutt, and according to the latter text (page
427), "current versions of BSD UNIX do not use the triple indirect
pointer ... partly because the 32-bit addresses used in the file system
precludes file sizes larger than 2Gb".

The former text shares the same/similar view and states (on page 380)
that: "the number of blocks that can be allocated to a file exceeds the
amount of space addressable by the 4-byte file pointers ..."

Does this mean that the theoretical maximum file size of approx 16GB
(assuming 1KB disk blocks) cannot be achieved on a 32-bit system? ...
I'm trying to get my mind round this, and this is what I've come up
with so far:

First of all, I don't see the read or write system calls failing since
they return the number of bytes read or written per call. However,
lseek would be a problem because lseek returns the position of the
read/write pointer of the file descriptor - the maximum size of which
is off_t (dunno what this is; long int?).

Anyway, because the read and write system calls also use this
"pointer", the size of off_t determines the practical maximum
file size on the system in question.

This is my reasoning. Does it make sense? Is there any other reason why
the theoretical maximum file size is unobtainable?

Thanks,

- Olumide
The time now is Thu Jan 08, 2009 1:21 pm | All times are GMT